Matthew Benstead
2020-Oct-16 20:34 UTC
[Gluster-users] Gluster7 GeoReplication Operation not permitted and incomplete sync
I have updated to Gluster 7.8 and re-enabled the open-behind option, but I am still getting the same errors:

* dict set of key for set-ctime-mdata failed
* gfid different on the target file on pcic-backup-readdir-ahead-1
* remote operation failed [No such file or directory]

Any suggestions? We're looking at abandoning Geo-Replication if we can't get this sorted out...

[2020-10-16 20:30:25.039659] E [MSGID: 109009] [dht-helper.c:1384:dht_migration_complete_check_task] 0-pcic-backup-dht: 24bf0575-6ab0-4613-b42a-3b63b3c00165: gfid different on the target file on pcic-backup-readdir-ahead-0
[2020-10-16 20:30:25.039695] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-pcic-backup-utime: dict set of key for set-ctime-mdata failed [Input/output error]
[2020-10-16 20:30:25.122666] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 14723: LINK() /.gfid/5b4b9b0b-8436-4d2c-92e8-93c621b38949/__init__.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.145772] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 14750: LINK() /.gfid/b70830fc-2085-4783-bc1c-ac641e572f70/test_variance_threshold.py => -1 (File exists)
[2020-10-16 20:30:25.268853] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 14929: LINK() /.gfid/8eabf893-5e81-4ef2-bbe5-236f316e7a2e/cloudpickle_wrapper.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.306886] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 14979: LINK() /.gfid/e2878d00-614c-4979-b87a-16d66518fbba/setup.py => -1 (File exists)
[2020-10-16 20:30:25.316890] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 14992: LINK() /.gfid/7819bd57-25e5-4e01-8f14-efbb8e2bcc76/__init__.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.324419] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15002: LINK() /.gfid/24ba8e1a-3c8d-493a-aec9-a581098264d3/test_ranking.py => -1 (File exists)
[2020-10-16 20:30:25.463950] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15281: LINK() /.gfid/f050b8ca-0efe-4a9a-b059-66d0bbe7ed88/expatbuilder.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.502174] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15339: LINK() /.gfid/f1e91a13-8080-4c8b-8a53-a3d6fda03eb4/plat_win.py => -1 (File exists)
[2020-10-16 20:30:25.543296] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15399: LINK() /.gfid/d8cc28fd-021b-4b80-913b-fb3886d88152/mouse.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.544819] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-pcic-backup-client-1: remote operation failed [Stale file handle]
[2020-10-16 20:30:25.545604] E [MSGID: 109009] [dht-helper.c:1384:dht_migration_complete_check_task] 0-pcic-backup-dht: c6aa9d38-0b23-4467-90a0-b6e175e4852a: gfid different on the target file on pcic-backup-readdir-ahead-0
[2020-10-16 20:30:25.545636] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-pcic-backup-utime: dict set of key for set-ctime-mdata failed [Input/output error]
[2020-10-16 20:30:25.549597] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15404: LINK() /.gfid/d8cc28fd-021b-4b80-913b-fb3886d88152/scroll.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.553530] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15409: LINK() /.gfid/d8cc28fd-021b-4b80-913b-fb3886d88152/named_commands.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.559450] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15416: LINK() /.gfid/d8cc28fd-021b-4b80-913b-fb3886d88152/open_in_editor.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.563162] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15421: LINK() /.gfid/d8cc28fd-021b-4b80-913b-fb3886d88152/completion.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.686630] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15598: LINK() /.gfid/3611cdaf-03ae-43cf-b7e2-d062eb690ce3/test_files.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:25.687868] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-pcic-backup-client-1: remote operation failed [No such file or directory]
[2020-10-16 20:30:25.688919] E [MSGID: 109009] [dht-helper.c:1384:dht_migration_complete_check_task] 0-pcic-backup-dht: c93efed8-fd4d-42a6-b833-f8c1613e63be: gfid different on the target file on pcic-backup-readdir-ahead-0
[2020-10-16 20:30:25.688968] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-pcic-backup-utime: dict set of key for set-ctime-mdata failed [Input/output error]
[2020-10-16 20:30:25.848663] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15803: LINK() /.gfid/2362c20c-8a2a-4cf0-bcf6-59bc5274cac3/highlevel.cpython-36.pyc => -1 (File exists)
[2020-10-16 20:30:26.022115] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 16019: LINK() /.gfid/17f12c4d-3be3-461f-9536-23471edecc23/test_rotate_winds.py => -1 (File exists)
[2020-10-16 20:30:26.104627] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 16108: LINK() /.gfid/30dbb2fe-1d28-41f0-a907-bf9e7c976d73/__init__.py => -1 (File exists)
[2020-10-16 20:30:26.122917] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 16140: LINK() /.gfid/ddcfec9c-8fa3-4a1c-9830-ee086c18cb43/__init__.py => -1 (File exists)
[2020-10-16 20:30:26.138888] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 16166: LINK() /.gfid/4ae973f4-138a-42dc-8869-2e26a4a458ef/test__convert_vertical_coords.py => -1 (File exists)
[2020-10-16 20:30:26.139826] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-pcic-backup-client-1: remote operation failed [No such file or directory]
[2020-10-16 20:30:26.140751] E [MSGID: 109009] [dht-helper.c:1384:dht_migration_complete_check_task] 0-pcic-backup-dht: 4dd5e90b-f854-49d3-88b7-2b176b0a9ca6: gfid different on the target file on pcic-backup-readdir-ahead-0
[2020-10-16 20:30:26.140775] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-pcic-backup-utime: dict set of key for set-ctime-mdata failed [Input/output error]
[2020-10-16 20:30:26.219401] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 16301: LINK() /.gfid/5df47a90-d176-441b-8ad9-5e845f88edf1/YlOrBr_09.txt => -1 (File exists)
[2020-10-16 20:30:26.220167] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-pcic-backup-client-0: remote operation failed [No such file or directory]
[2020-10-16 20:30:26.220984] E [MSGID: 109009] [dht-helper.c:1384:dht_migration_complete_check_task] 0-pcic-backup-dht: e6d97a13-7bae-47a2-b574-e989ca8e2f9a: gfid different on the target file on pcic-backup-readdir-ahead-1
[2020-10-16 20:30:26.221015] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-pcic-backup-utime: dict set of key for set-ctime-mdata failed [Input/output error]

Thanks,
 -Matthew

On 10/13/20 8:57 AM, Matthew Benstead wrote:
> Further to this - After rebuilding the slave volume with the xattr=sa
> option and then destroying and restarting the geo-replication
> sync, I am still getting "extended attribute not supported by the
> backend storage" errors:
>
> [root@storage01 storage_10.0.231.81_pcic-backup]# tail mnt-data-storage_a-storage.log
> [2020-10-13 14:00:27.095418] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 14:13:44.497710] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 14:19:07.245191] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 14:33:24.031232] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 14:41:54.070198] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 14:53:27.740279] E [fuse-bridge.c:4288:fuse_xattr_cbk]
> 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 15:02:31.951660] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 15:07:41.470933] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 15:18:42.664005] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
> [2020-10-13 15:26:17.510656] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>
> When checking the logs I see errors around the set-ctime-mdata - dict
> set of key for set-ctime-mdata failed [Input/output error]
>
> [root@pcic-backup01 storage_10.0.231.81_pcic-backup]# tail -20 mnt-10.0.231.93-data-storage_b-storage.log
> [2020-10-13 15:40:38.579096] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15133: LINK() /.gfid/ba2374d3-23ac-4094-b795-b03738583765/ui-icons_222222_256x240.png => -1 (File exists)
> [2020-10-13 15:40:38.583874] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15138: LINK() /.gfid/ba2374d3-23ac-4094-b795-b03738583765/ui-bg_glass_95_fef1ec_1x400.png => -1 (File exists)
> [2020-10-13 15:40:38.584828] W [MSGID: 114031] [client-rpc-fops_v2.c:850:client4_0_setxattr_cbk] 0-pcic-backup-client-1: remote operation failed [No such file or directory]
> [2020-10-13 15:40:38.585887] E [MSGID: 109009] [dht-helper.c:1384:dht_migration_complete_check_task] 0-pcic-backup-dht: 5e2d07f2-253f-442b-9df7-68848cf3b541: gfid different on the target file on pcic-backup-readdir-ahead-0
> [2020-10-13 15:40:38.585916] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-pcic-backup-utime: dict set of key for set-ctime-mdata failed [Input/output error]
> [2020-10-13 15:40:38.604843] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15152: LINK() /.gfid/5fb546eb-87b3-4a4d-954c-c3a2ad8d06b5/font-awesome.min.css => -1 (File exists)
> [2020-10-13 15:40:38.770794] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: <gfid:c4173839-957a-46ef-873b-7974305ee5ff> (c4173839-957a-46ef-873b-7974305ee5ff) [No such file or directory]
> [2020-10-13 15:40:38.774303] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15292: LINK() /.gfid/52e174c2-766f-4d95-8415-27ea020b7c8d/MathML.js => -1 (File exists)
> [2020-10-13 15:40:38.790133] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15297: LINK() /.gfid/52e174c2-766f-4d95-8415-27ea020b7c8d/HTML-CSS.js => -1 (File exists)
> [2020-10-13 15:40:38.813826] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15323: LINK() /.gfid/ad758f3a-e6b9-4a1c-a9f2-2e4a58954e83/latin-mathfonts-bold-fraktur.js => -1 (File exists)
> [2020-10-13 15:40:38.830217] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15340: LINK() /.gfid/ad758f3a-e6b9-4a1c-a9f2-2e4a58954e83/math_harpoons.js => -1 (File exists)
> [2020-10-13 15:40:39.084522] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: <gfid:10e95272-a0d3-404e-b0da-9c87f2f450b0> (10e95272-a0d3-404e-b0da-9c87f2f450b0) [No such file or directory]
> [2020-10-13 15:40:39.114516] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15571: LINK() /.gfid/8bb4435f-32f1-44a6-9346-82e4d7d867d4/sieve.js => -1 (File exists)
> [2020-10-13 15:40:39.233346] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: <gfid:318e1260-cf0e-44d3-b964-33165aabf6fe> (318e1260-cf0e-44d3-b964-33165aabf6fe) [No such file or directory]
> [2020-10-13 15:40:39.236109] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15720: LINK() /.gfid/179f6634-be43-44d6-8e46-d493ddc52b9e/__init__.cpython-36.pyc => -1 (File exists)
> [2020-10-13 15:40:39.259296] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: /.gfid/1c4b39ee-cadb-49df-a3ee-ea9648913d8a/blocking.py (00000000-0000-0000-0000-000000000000) [No data available]
> [2020-10-13 15:40:39.340758] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15870: LINK() /.gfid/f5d6b380-b9b7-46e7-be69-668f0345a8a4/top_level.txt => -1 (File exists)
> [2020-10-13 15:40:39.414092] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 15945: LINK() /.gfid/2ae57be0-ec4b-4b2a-95e5-cdedd98061f0/ar_MR.dat => -1 (File exists)
> [2020-10-13 15:40:39.941258] W [MSGID: 114031] [client-rpc-fops_v2.c:2633:client4_0_lookup_cbk] 0-pcic-backup-client-1: remote operation failed. Path: <gfid:7752a80a-7dea-4dc1-80dd-e57d10b57640> (7752a80a-7dea-4dc1-80dd-e57d10b57640) [No such file or directory]
> [2020-10-13 15:40:39.944186] W [fuse-bridge.c:1047:fuse_entry_cbk] 0-glusterfs-fuse: 16504: LINK() /.gfid/943e08bf-803d-492f-81c1-cba34e867956/heaps.cpython-36.pyc => -1 (File exists)
>
> Any thoughts on this?
>
> Thanks,
>  -Matthew
>
> --
> Matthew Benstead
> System Administrator
> Pacific Climate Impacts Consortium <https://pacificclimate.org/>
> University of Victoria, UH1
> PO Box 1800, STN CSC
> Victoria, BC, V8W 2Y2
> Phone: +1-250-721-8432
> Email: matthewb at uvic.ca
>
> On 10/5/20 1:28 PM, Matthew Benstead wrote:
>> Hmm... Looks like I forgot to set the xattr's to sa - I left them as
>> default.
>>
>> [root@pcic-backup01 ~]# zfs get xattr pcic-backup01-zpool
>> NAME                 PROPERTY  VALUE  SOURCE
>> pcic-backup01-zpool  xattr     on     default
>>
>> [root@pcic-backup02 ~]# zfs get xattr pcic-backup02-zpool
>> NAME                 PROPERTY  VALUE  SOURCE
>> pcic-backup02-zpool  xattr     on     default
>>
>> I wonder if I can change them and continue, or if I need to blow away
>> the zpool and start over?
>>
>> Thanks,
>>  -Matthew
>>
>> --
>> Matthew Benstead
>> System Administrator
>> Pacific Climate Impacts Consortium <https://pacificclimate.org/>
>> University of Victoria, UH1
>> PO Box 1800, STN CSC
>> Victoria, BC, V8W 2Y2
>> Phone: +1-250-721-8432
>> Email: matthewb at uvic.ca
>>
>> On 10/5/20 12:53 PM, Felix Kölzow wrote:
>>>
>>> Dear Matthew,
>>>
>>> this is our configuration:
>>>
>>> zfs get all mypool
>>>
>>> mypool  xattr    sa        local
>>> mypool  acltype  posixacl  local
>>>
>>> Something more to consider?
>>>
>>> Regards,
>>>
>>> Felix
>>>
>>> On 05/10/2020 21:11, Matthew Benstead wrote:
>>>> Thanks Felix - looking through some more of the logs I may have
>>>> found the reason...
>>>>
>>>> From
>>>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/mnt-data-storage_a-storage.log
>>>>
>>>> [2020-10-05 18:13:35.736838] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:18:53.885591] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:22:14.405234] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:25:53.971679] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:31:44.571557] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:36:36.508772] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:40:10.401055] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:42:57.833536] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:45:19.691953] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:48:26.478532] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>> [2020-10-05 18:52:24.466914] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>>>>
>>>> The slave nodes are running gluster on top of ZFS, but I had
>>>> configured ACLs - is there something else missing to make this work
>>>> with ZFS?
>>>>
>>>> [root@pcic-backup01 ~]# gluster volume info
>>>>
>>>> Volume Name: pcic-backup
>>>> Type: Distribute
>>>> Volume ID: 7af8a424-f4b6-4405-bba1-0dbafb0fa231
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: 10.0.231.81:/pcic-backup01-zpool/brick
>>>> Brick2: 10.0.231.82:/pcic-backup02-zpool/brick
>>>> Options Reconfigured:
>>>> network.ping-timeout: 10
>>>> performance.cache-size: 256MB
>>>> server.event-threads: 4
>>>> client.event-threads: 4
>>>> cluster.lookup-optimize: on
>>>> performance.parallel-readdir: on
>>>> performance.readdir-ahead: on
>>>> features.quota-deem-statfs: on
>>>> features.inode-quota: on
>>>> features.quota: on
>>>> transport.address-family: inet
>>>> nfs.disable: on
>>>> features.read-only: off
>>>> performance.open-behind: off
>>>>
>>>> [root@pcic-backup01 ~]# zfs get acltype pcic-backup01-zpool
>>>> NAME                 PROPERTY  VALUE     SOURCE
>>>> pcic-backup01-zpool  acltype   posixacl  local
>>>>
>>>> [root@pcic-backup01 ~]# grep "pcic-backup0" /proc/mounts
>>>> pcic-backup01-zpool /pcic-backup01-zpool zfs rw,seclabel,xattr,posixacl 0 0
>>>>
>>>> [root@pcic-backup02 ~]# zfs get acltype pcic-backup02-zpool
>>>> NAME                 PROPERTY  VALUE     SOURCE
>>>> pcic-backup02-zpool  acltype   posixacl  local
>>>>
>>>> [root@pcic-backup02 ~]# grep "pcic-backup0" /proc/mounts
>>>> pcic-backup02-zpool /pcic-backup02-zpool zfs rw,seclabel,xattr,posixacl 0 0
>>>>
>>>> Thanks,
>>>>  -Matthew
>>>>
>>>> --
>>>> Matthew Benstead
>>>> System Administrator
>>>> Pacific Climate Impacts Consortium <https://pacificclimate.org/>
>>>> University of Victoria, UH1
>>>> PO Box 1800, STN CSC
>>>> Victoria, BC, V8W 2Y2
>>>> Phone: +1-250-721-8432
>>>> Email: matthewb at uvic.ca
>>>>
>>>> On 10/5/20 1:39 AM, Felix Kölzow wrote:
>>>>> Dear Matthew,
>>>>>
>>>>> can you provide more information regarding the geo-replication
>>>>> brick logs?
>>>>>
>>>>> These files are also located in:
>>>>>
>>>>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/
>>>>>
>>>>> Usually, these log files are more precise for figuring out the
>>>>> root cause of the error.
>>>>>
>>>>> Additionally, it is also worth looking at the log files on the
>>>>> slave side.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Felix
>>>>>
>>>>> On 01/10/2020 23:08, Matthew Benstead wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I'm looking for some help with a GeoReplication Error in my Gluster
>>>>>> 7/CentOS 7 setup. Replication progress has basically stopped, and
>>>>>> the status of the replication keeps switching.
>>>>>>
>>>>>> The gsyncd log has errors like "Operation not permitted",
>>>>>> "incomplete sync", etc... help? I'm not sure how to proceed in
>>>>>> troubleshooting this.
>>>>>>
>>>>>> The log is here, it basically just repeats - from:
>>>>>> /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.log
>>>>>>
>>>>>> [2020-10-01 20:52:15.291923] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken duration=32.8466    num_files=1749    job=3    return_code=23
>>>>>> [2020-10-01 20:52:18.700062] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken duration=43.1210    num_files=3167    job=6    return_code=23
>>>>>> [2020-10-01 20:52:23.383234] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['XSYNC-CHANGELOG.1601585397']
>>>>>> [2020-10-01 20:52:28.537657] E [repce(worker /data/storage_b/storage):213:__call__] RepceClient: call failed call=258187:140538843596608:1601585515.63    method=entry_ops error=OSError
>>>>>> [2020-10-01 20:52:28.538064] E [syncdutils(worker /data/storage_b/storage):339:log_raise_exception] <top>: FAIL:
>>>>>> Traceback (most recent call last):
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 332, in main
>>>>>>     func(args)
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 86, in subcmd_worker
>>>>>>     local.service_loop(remote)
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1308, in service_loop
>>>>>>     g1.crawlwrap(oneshot=True, register_time=register_time)
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 602, in crawlwrap
>>>>>>     self.crawl()
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1682, in crawl
>>>>>>     self.process([item[1]], 0)
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1327, in process
>>>>>>     self.process_change(change, done, retry)
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1221, in process_change
>>>>>>     failures = self.slave.server.entry_ops(entries)
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 232, in __call__
>>>>>>     return self.ins(self.meth, *a)
>>>>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 214, in __call__
>>>>>>     raise res
>>>>>> OSError: [Errno 1] Operation not permitted
>>>>>> [2020-10-01 20:52:28.570316] I [repce(agent /data/storage_b/storage):96:service_loop] RepceServer: terminating on reaching EOF.
>>>>>> [2020-10-01 20:52:28.613603] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
>>>>>> [2020-10-01 20:52:29.619797] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken duration=5.6458 num_files=455    job=3    return_code=23
>>>>>> [2020-10-01 20:52:38.286245] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken duration=14.1824    num_files=1333    job=2    return_code=23
>>>>>> [2020-10-01 20:52:38.628156] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Initializing...
>>>>>> [2020-10-01 20:52:38.628325] I [monitor(monitor):159:monitor] Monitor: starting gsyncd worker    brick=/data/storage_b/storage slave_node=10.0.231.82
>>>>>> [2020-10-01 20:52:38.684736] I [gsyncd(agent /data/storage_b/storage):318:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
>>>>>> [2020-10-01 20:52:38.687213] I [gsyncd(worker /data/storage_b/storage):318:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
>>>>>> [2020-10-01 20:52:38.687401] I [changelogagent(agent /data/storage_b/storage):72:__init__] ChangelogAgent: Agent listining...
>>>>>> [2020-10-01 20:52:38.703295] I [resource(worker /data/storage_b/storage):1386:connect_remote] SSH: Initializing SSH connection between master and slave...
>>>>>> [2020-10-01 20:52:40.388372] I [resource(worker /data/storage_b/storage):1435:connect_remote] SSH: SSH connection between master and slave established. duration=1.6849
>>>>>> [2020-10-01 20:52:40.388582] I [resource(worker /data/storage_b/storage):1105:connect] GLUSTER: Mounting gluster volume locally...
>>>>>> [2020-10-01 20:52:41.501105] I [resource(worker /data/storage_b/storage):1128:connect] GLUSTER: Mounted gluster volume duration=1.1123
>>>>>> [2020-10-01 20:52:41.501405] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
>>>>>> [2020-10-01 20:52:43.531146] I [master(worker /data/storage_b/storage):1640:register] _GMaster: Working dir path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
>>>>>> [2020-10-01 20:52:43.533953] I [resource(worker /data/storage_b/storage):1291:service_loop] GLUSTER: Register time time=1601585563
>>>>>> [2020-10-01 20:52:43.547092] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change status=Active
>>>>>> [2020-10-01 20:52:43.561920] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=History Crawl
>>>>>> [2020-10-01 20:52:43.562184] I [master(worker /data/storage_b/storage):1554:crawl] _GMaster: starting history crawl    turns=1 stime=None    entry_stime=None    etime=1601585563
>>>>>> [2020-10-01 20:52:43.562269] I [resource(worker /data/storage_b/storage):1307:service_loop] GLUSTER: No stime available, using xsync crawl
>>>>>> [2020-10-01 20:52:43.569799] I [master(worker /data/storage_b/storage):1670:crawl] _GMaster: starting hybrid crawl    stime=None
>>>>>> [2020-10-01 20:52:43.573528] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change    status=Hybrid Crawl
>>>>>> [2020-10-01 20:52:44.370985] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken duration=20.4307    num_files=2609    job=5    return_code=23
>>>>>> [2020-10-01 20:52:49.431854] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['XSYNC-CHANGELOG.1601585397']
>>>>>> [2020-10-01 20:52:54.801500] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken duration=72.7492    num_files=4227    job=2    return_code=23
>>>>>> [2020-10-01 20:52:56.766547] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken duration=74.3569    num_files=4674    job=5    return_code=23
>>>>>> [2020-10-01 20:53:18.853333] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken duration=28.7125    num_files=4397    job=3    return_code=23
>>>>>> [2020-10-01 20:53:21.224921] W [master(worker /data/storage_a/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['CHANGELOG.1601044033', 'CHANGELOG.1601044048', 'CHANGELOG.1601044063', 'CHANGELOG.1601044078', 'CHANGELOG.1601044093', 'CHANGELOG.1601044108', 'CHANGELOG.1601044123']
>>>>>> [2020-10-01 20:53:22.134536] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken duration=0.2159 num_files=3    job=3    return_code=23
>>>>>> [2020-10-01 20:53:25.615712] I [master(worker /data/storage_b/storage):1681:crawl] _GMaster: processing xsync changelog path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage/xsync/XSYNC-CHANGELOG.1601585563
>>>>>> [2020-10-01 20:53:25.634970] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs    files=['XSYNC-CHANGELOG.1601585397']
>>>>>>
>>>>>> GeoReplication status - see it change from Active to Faulty:
>>>>>>
>>>>>> [root@storage01 ~]# gluster volume geo-replication status
>>>>>>
>>>>>> MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE     STATUS    CRAWL STATUS       LAST_SYNCED
>>>>>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>> 10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 07:26:57
>>>>>> 10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>>> 10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>>> 10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    History Crawl      2020-09-23 01:56:05
>>>>>> 10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>>> 10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
>>>>>> 10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 06:55:57
>>>>>> 10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
>>>>>> 10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
>>>>>>
>>>>>> [root@storage01 ~]# gluster volume geo-replication status
>>>>>>
>>>>>> MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE     STATUS    CRAWL STATUS       LAST_SYNCED
>>>>>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>> 10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 07:26:57
>>>>>> 10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>>> 10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A
>>>>>> 10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    History Crawl      2020-09-23 01:58:05
>>>>>> 10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
>>>>>> 10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A
>>>>>> 10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 06:58:56
>>>>>> 10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
>>>>>> 10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A            Faulty    N/A                N/A
>>>>>>
>>>>>> Cluster information: (Note - disabled performance.open-behind to
>>>>>> work around https://github.com/gluster/glusterfs/issues/1440 )
>>>>>>
>>>>>> [root@storage01 ~]# gluster --version | head -1; cat /etc/centos-release; uname -r
>>>>>> glusterfs 7.7
>>>>>> CentOS Linux release 7.8.2003 (Core)
>>>>>> 3.10.0-1127.10.1.el7.x86_64
>>>>>>
>>>>>> [root@storage01 ~]# df -h /storage2/
>>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>>> 10.0.231.91:/storage  328T  228T  100T  70% /storage2
>>>>>>
>>>>>> [root@storage01 ~]# gluster volume info
>>>>>>
>>>>>> Volume Name: storage
>>>>>> Type: Distributed-Replicate
>>>>>> Volume ID: cf94a8f2-324b-40b3-bf72-c3766100ea99
>>>>>> Status: Started
>>>>>> Snapshot Count: 0
>>>>>> Number of Bricks: 3 x (2 + 1) = 9
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: 10.0.231.91:/data/storage_a/storage
>>>>>> Brick2: 10.0.231.92:/data/storage_b/storage
>>>>>> Brick3: 10.0.231.93:/data/storage_c/storage (arbiter)
>>>>>> Brick4: 10.0.231.92:/data/storage_a/storage
>>>>>> Brick5: 10.0.231.93:/data/storage_b/storage
>>>>>> Brick6: 10.0.231.91:/data/storage_c/storage (arbiter)
>>>>>> Brick7: 10.0.231.93:/data/storage_a/storage
>>>>>> Brick8: 10.0.231.91:/data/storage_b/storage
>>>>>> Brick9: 10.0.231.92:/data/storage_c/storage (arbiter)
>>>>>> Options Reconfigured:
>>>>>> changelog.changelog: on
>>>>>> geo-replication.ignore-pid-check: on
>>>>>> geo-replication.indexing: on
>>>>>> network.ping-timeout: 10
>>>>>> features.inode-quota: on
>>>>>> features.quota: on
>>>>>> nfs.disable: on
>>>>>> features.quota-deem-statfs: on
>>>>>> storage.fips-mode-rchecksum: on
>>>>>> performance.readdir-ahead: on
>>>>>> performance.parallel-readdir: on
>>>>>> cluster.lookup-optimize: on
>>>>>> client.event-threads: 4
>>>>>> server.event-threads: 4
>>>>>> performance.cache-size: 256MB
>>>>>> performance.open-behind: off
>>>>>>
>>>>>> Thanks,
>>>>>>  -Matthew
>>>>>> ________
>>>>>>
>>>>>> Community Meeting Calendar:
>>>>>>
>>>>>> Schedule -
>>>>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
Strahil Nikolov
2020-Oct-19 04:01 UTC
[Gluster-users] Gluster7 GeoReplication Operation not permitted and incomplete sync
>[2020-10-16 20:30:25.039659] E [MSGID: 109009] [dht-helper.c:1384:dht_migration_complete_check_task] 0-pcic-backup-dht: 24bf0575-6ab0-4613-b42a-3b63b3c00165: gfid different on the target file on pcic-backup-readdir-ahead-0
>[2020-10-16 20:30:25.039695] E [MSGID: 148002] [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-pcic-backup-utime: dict set of key for set-ctime-mdata failed [Input/output error]

I would start by finding that gfid on the source volume and then identifying the gfid of the same file on the geo-rep (destination) volume, to check whether the two differ. It also looks like you have some ACL/extended-attribute issues on the geo-rep destination, so take a look at those as well.

Best Regards,
Strahil Nikolov
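For readers following along, here is a minimal sketch in Python of the gfid check suggested in this reply. It assumes the usual brick layout, where every gfid has an entry at `<brick>/.glusterfs/<first two hex chars>/<next two>/<gfid>`, and that the brick stores the gfid as 16 raw bytes in the `trusted.gfid` extended attribute. The function names are hypothetical, the brick path is simply one of the source bricks listed earlier in the thread, and reading `trusted.*` attributes needs root on a Linux brick host.

```python
import os
import uuid


def gfid_backend_path(brick_root: str, gfid: str) -> str:
    """Build the .glusterfs entry path for a gfid on a given brick.

    Hypothetical helper: only computes the path, it does not check
    that the entry exists on this brick.
    """
    canonical = str(uuid.UUID(gfid))  # validates and lower-cases the gfid
    return os.path.join(
        brick_root, ".glusterfs", canonical[:2], canonical[2:4], canonical
    )


def read_gfid(path: str) -> str:
    """Read trusted.gfid from a file on a brick (Linux only, needs root)."""
    raw = os.getxattr(path, "trusted.gfid")  # 16 raw bytes
    return str(uuid.UUID(bytes=raw))


if __name__ == "__main__":
    # gfid taken from the MSGID 109009 error quoted in this thread.
    gfid = "24bf0575-6ab0-4613-b42a-3b63b3c00165"
    # Assumption: one of the source bricks; repeat for each brick.
    print(gfid_backend_path("/data/storage_a/storage", gfid))
```

Run the path helper for each brick on both the source and the destination. Once the real file path is known, `read_gfid` on the brick copy at each end shows whether the two sides disagree.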