Anant Saraswat
2024-Feb-08 12:42 UTC
[Gluster-users] __Geo-replication status is getting Faulty after few seconds
Hi Everyone,

As I was getting the error "OSError: [Errno 107] Transport endpoint is not connected: '.gfid/d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8'" in the gsyncd log on the primary master node, I searched for this file and found it in the brick, under the .glusterfs folder on the master1 node.

Path on master1 - /opt/tier1data2019/brick/.glusterfs/d5/3f/d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8

[root@master1 ~]# ls -lrt /opt/tier1data2019/brick/.glusterfs/d5/3f/
-rw-r--r-- 2 root root   15996 Dec 14 10:10 d53feba6-dc8b-4645-a86c-befabd0e5069
-rw-r--r-- 2 root root  343111 Dec 18 10:55 d53fed32-b47a-48bf-889e-140c69b04479
-rw-r--r-- 2 root root 5060531 Dec 29 15:29 d53f184d-91e8-4bc1-b6e7-bb5f27ef8b41
-rw-r--r-- 2 root root 2149782 Jan 12 13:25 d53ffee5-fa66-4493-8bdf-f2093b3f6ce7
-rw-r--r-- 2 root root 1913460 Jan 18 10:40 d53f799b-0e87-4800-a3cd-fac9e1a30b54
-rw-r--r-- 2 root root   62940 Jan 22 09:35 d53fb9d4-8c64-4a83-b968-bbbfb9af4224
-rw-r--r-- 1 root root  174592 Jan 22 15:06 d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8
-rw-r--r-- 2 root root    5633 Jan 26 08:36 d53f6bf6-9aac-476c-b8c5-0569fc8d5116
-rw-r--r-- 2 root root  801740 Feb  8 11:40 d53f71f8-e88b-4ece-b66e-228c2b08d6c8

Now I have noticed two things:

First, this file is only present on the primary master node (master1) and doesn't exist on the master2 and master3 nodes.

Second, this file has a different hard-link count than the other files in the folder. If you check the second column of the above output, every file has "2", but this file has "1".

Can someone please explain why this file has "1" and what I should do next? Is it safe to copy this file to the remaining two master nodes, or should I delete it from master1?

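For reference, the brick path above can be derived from the GFID in the error message. Below is a minimal Python sketch, assuming the layout visible in the listing (.glusterfs/<first two hex characters>/<next two hex characters>/<gfid>). The function names gfid_backend_path and link_count are illustrative only and are not part of Gluster.

```python
import os


def gfid_backend_path(brick_root: str, gfid: str) -> str:
    """Build the expected .glusterfs path for a GFID on a brick.

    Assumption: the two directory levels are the first and second pairs
    of hex characters of the GFID, as seen in the listing above.
    """
    gfid = gfid.lower()
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def link_count(path: str) -> int:
    """Return the hard-link count (the second column of `ls -l`)."""
    return os.lstat(path).st_nlink


if __name__ == "__main__":
    path = gfid_backend_path(
        "/opt/tier1data2019/brick",
        "d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8",
    )
    print(path)
    if os.path.lexists(path):
        # Only meaningful when run on the brick host itself.
        print("hard links:", link_count(path))
```

Both functions only read metadata, so running them on the brick should not change anything.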
Many thanks,
Anant

________________________________
From: Gluster-users <gluster-users-bounces at gluster.org> on behalf of Anant Saraswat <anant.saraswat at techblue.co.uk>
Sent: 08 February 2024 12:01 AM
To: Aravinda <aravinda at kadalu.tech>
Cc: gluster-users at gluster.org <gluster-users at gluster.org>
Subject: Re: [Gluster-users] __Geo-replication status is getting Faulty after few seconds

Hi @Aravinda,

I have checked the rsync version, and it's the same on primary and secondary nodes. We have rsync version 3.1.3, protocol version 31, on all servers. It's very strange that we have not made any changes, that we are aware of, and this geo-replication was working fine for the last 5 years, and suddenly it has stopped, and we are unable to understand the root cause of it.

I have checked the tcpdump and I can see that the master node is sending RST to the secondary node when geo-replication connects, but we are not seeing any RST when we do the ssh using the root user from master to secondary node ourselves, which makes me think that geo-replication is able to connect to the secondary node but after that, it's not liking something and tries to reset the connection, and this is repeating in a loop.

I have also enabled geo-replication debug logs and I am getting this error in the master node gsyncd logs.

[2024-02-07 22:37:36.820978] D [repce(worker /opt/tier1data2019/brick):195:push] RepceClient: call 2563661:140414778891136:1707345456.8209238 entry_ops([{'op': 'CREATE', 'skip_entry': False, 'gfid': '3d57e1e4-7bd2-44f6-a6d1-d628208b3697', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8795785720233840105.docx', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': '3d57e1e4-7bd2-44f6-a6d1-d628208b3697', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8795785720233840105.docx'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '7bd35f91-1408-476d-869a-9936f2d94afc', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/0c3fb22f-0fbe-4445-845b-9d94d84a9888', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '3837018c-2f5e-43d4-ab58-0ed8b7456e73', 'entry': '.gfid/861afb81-386a-4b5b-af37-cef63a55a436/26fcd7e7-2c8c-4dcb-96f2-2c8a0d79f3d4', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'db311b10-b1e2-4b84-adea-a6746214aeda', 'entry': '.gfid/861afb81-386a-4b5b-af37-cef63a55a436/0526d0da-1f36-4203-8563-7e23aacf6237', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '9bbb253a-226a-44b1-a968-7cfa76cf9463', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeLLRenewalLetterDocusign_1_22_15_1_18_153.doc', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': '9bbb253a-226a-44b1-a968-7cfa76cf9463', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeLLRenewalLetterDocusign_1_22_15_1_18_153.doc'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'f62d0c65-6ede-48ff-b9bf-c44a33e5e023', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/85530794-c15f-44d4-8660-87a14c2c9c8c', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'fd3d0af6-8ef5-4b76-bb47-0bc508df0ed0', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeMOA_1_22_15_1_20_501.doc', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': 'fd3d0af6-8ef5-4b76-bb47-0bc508df0ed0', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeMOA_1_22_15_1_20_501.doc'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'e93c5771-9676-40d4-90cd-f0586ec05dd9', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/cc372667-3b77-468f-bac6-671d4eb069e9', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '02045f44-68ff-4a35-a843-08939afc46a4', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeTTRenewalLetterASTNoFee-2022_1_22_15_1_19_530.doc', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': '02045f44-68ff-4a35-a843-08939afc46a4', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeTTRenewalLetterASTNoFee-2022_1_22_15_1_19_530.doc'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '6f5766c9-2dc3-4636-9041-9cf4ac64d26b', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/556a0e3c-510d-4396-8f32-335aafec1314', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': 'f78561f0-c9f2-4192-a82a-8368e0ad8b2b', 'entry': '.gfid/ec161c2e-bb32-4639-a7b2-9be961221d86/app_1705935977525.tmp'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'd1e33edb-523e-41c1-a021-8bd3a5a2c7c0', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/c655e3e5-9d4c-43d7-9171-949f01612e6d', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'b6f44b28-c2bf-4e70-b953-1c559ded7835', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge7370453767656401681.docx', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': 'b6f44b28-c2bf-4e70-b953-1c559ded7835', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge7370453767656401681.docx'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 
'2d845d9e-7a49-4200-a100-759fe831ba0e', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/84d47d84-5749-4a19-8f73-293078d17c63', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '44554c17-21aa-427a-b796-7ecec6af2570', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8634804987715893755.docx', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '652bf5d7-3b7a-41d8-aa4f-e52296034821', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/91a25682-69ea-4edc-9250-d6c7aac56853', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': '44554c17-21aa-427a-b796-7ecec6af2570', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8634804987715893755.docx'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '04720811-b90e-42b7-a5d1-656afd92e245', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/a66cbc42-61dc-4896-bb69-c715f1a820db', 'mode': 33188, 'uid': 0, 'gid': 0}],) ... [2024-02-07 22:37:36.909606] D [repce(worker /opt/tier1data2019/brick):215:__call__] RepceClient: call 2563661:140414778891136:1707345456.8209238 entry_ops -> [] [2024-02-07 22:37:36.911032] D [master(worker /opt/tier1data2019/brick):317:a_syncdata] _GMaster: files [{files={'.gfid/652bf5d7-3b7a-41d8-aa4f-e52296034821', '.gfid/2d845d9e-7a49-4200-a100-759fe831ba0e', '.gfid/3837018c-2f5e-43d4-ab58-0ed8b7456e73', '.gfid/e93c5771-9676-40d4-90cd-f0586ec05dd9', '.gfid/f62d0c65-6ede-48ff-b9bf-c44a33e5e023', '.gfid/7bd35f91-1408-476d-869a-9936f2d94afc', '.gfid/04720811-b90e-42b7-a5d1-656afd92e245', '.gfid/6f5766c9-2dc3-4636-9041-9cf4ac64d26b', '.gfid/db311b10-b1e2-4b84-adea-a6746214aeda', '.gfid/d1e33edb-523e-41c1-a021-8bd3a5a2c7c0'}}] [2024-02-07 22:37:36.911089] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/652bf5d7-3b7a-41d8-aa4f-e52296034821}] [2024-02-07 22:37:36.911133] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] 
_GMaster: candidate for syncing [{file=.gfid/2d845d9e-7a49-4200-a100-759fe831ba0e}] [2024-02-07 22:37:36.911169] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/3837018c-2f5e-43d4-ab58-0ed8b7456e73}] [2024-02-07 22:37:36.911202] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/e93c5771-9676-40d4-90cd-f0586ec05dd9}] [2024-02-07 22:37:36.911235] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/f62d0c65-6ede-48ff-b9bf-c44a33e5e023}] [2024-02-07 22:37:36.911268] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/7bd35f91-1408-476d-869a-9936f2d94afc}] [2024-02-07 22:37:36.911301] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/04720811-b90e-42b7-a5d1-656afd92e245}] [2024-02-07 22:37:36.911333] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/6f5766c9-2dc3-4636-9041-9cf4ac64d26b}] [2024-02-07 22:37:36.911366] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/db311b10-b1e2-4b84-adea-a6746214aeda}] [2024-02-07 22:37:36.911398] D [master(worker /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing [{file=.gfid/d1e33edb-523e-41c1-a021-8bd3a5a2c7c0}] [2024-02-07 22:37:36.911439] D [master(worker /opt/tier1data2019/brick):1344:process] _GMaster: processing change [{changelog=/var/lib/misc/gluster/gsyncd/tier1data_drtier1data_drtier1data/opt-tier1data2019-brick/.history/.processing/CHANGELOG.1705936007}] [2024-02-07 22:37:36.915193] E [syncdutils(worker /opt/tier1data2019/brick):346:log_raise_exception] <top>: Gluster Mount process exited [{error=ENOTCONN}] [2024-02-07 22:37:36.915252] E [syncdutils(worker /opt/tier1data2019/brick):363:log_raise_exception] <top>: FULL EXCEPTION 
TRACE:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 317, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 86, in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1298, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 604, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1614, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1510, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1345, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1071, in process_change
    st = lstat(pt)
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 589, in lstat
    return errno_wrap(os.lstat, [e], [ENOENT], [ESTALE, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 571, in errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: '.gfid/d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8'
[2024-02-07 22:37:37.344426] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/opt/tier1data2019/brick}]
[2024-02-07 22:37:37.346601] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]

Thanks,
Anant

________________________________
From: Aravinda <aravinda at kadalu.tech>
Sent: 07 February 2024 2:54 PM
To: Anant Saraswat <anant.saraswat at techblue.co.uk>
Cc: Strahil Nikolov <hunter86_bg at yahoo.com>; gluster-users at gluster.org <gluster-users at gluster.org>
Subject: Re: [Gluster-users] __Geo-replication status is getting Faulty after few seconds

It will keep track of the last sync time if you change to a non-root user. But I don't think the issue is related to root vs non-root user.

Even in non-root user based Geo-rep, the Primary volume is mounted using the root user only. Only on the secondary node will it use the Glusterd mountbroker to allow mounting the Secondary volume as a non-privileged user.

Check the rsync version on the Primary and Secondary nodes. Please fix the versions if they do not match.

--
Aravinda
Kadalu Technologies

---- On Wed, 07 Feb 2024 20:11:47 +0530 Anant Saraswat <anant.saraswat at techblue.co.uk> wrote ---

No, it was set up and running using the root user only.

Do you think I should set it up using a dedicated non-root user? Will it keep track of the old files, or will it treat this as a new geo-replication and copy all the files from scratch?

________________________________
From: Strahil Nikolov <hunter86_bg at yahoo.com>
Sent: 07 February 2024 2:36 PM
To: Anant Saraswat <anant.saraswat at techblue.co.uk>; Aravinda <aravinda at kadalu.tech>
Cc: gluster-users at gluster.org <gluster-users at gluster.org>
Subject: Re: [Gluster-users] __Geo-replication status is getting Faulty after few seconds

Have you tried setting up gluster georep with a dedicated non-root user?

Best Regards,
Strahil Nikolov

On Tue, Feb 6, 2024 at 16:38, Anant Saraswat <anant.saraswat at techblue.co.uk> wrote:
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

DISCLAIMER: This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error, please notify the sender. This message contains confidential information and is intended only for the individual named. If you are not the named addressee, you should not disseminate, distribute or copy this email. Please notify the sender immediately by email if you have received this email by mistake and delete this email from your system. If you are not the intended recipient, you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. Thanks for your cooperation.
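The traceback in this message ends in errno_wrap(os.lstat, [e], [ENOENT], [ESTALE, EBUSY]). Read literally, that call tolerates ENOENT and retries on ESTALE or EBUSY, which would explain why ENOTCONN (errno 107 on Linux) falls through and kills the worker. The sketch below is a simplified stand-in written for illustration, not gsyncd's actual implementation; the name errno_wrap_sketch, the helper make_failing_call, the retry limit, and the behaviour of returning the errno for tolerated errors are all assumptions.

```python
import errno
import os
import time


def errno_wrap_sketch(call, args=(), ignore=(), retry=(), max_retries=3, delay=0.0):
    """Call `call(*args)`, tolerating some errnos and retrying others.

    Hypothetical stand-in for the wrapper named in the traceback:
    - errnos in `ignore` are swallowed and returned as a value,
    - errnos in `retry` are retried up to `max_retries` times,
    - every other OSError (ENOTCONN included) is re-raised.
    """
    attempts = 0
    while True:
        try:
            return call(*args)
        except OSError as exc:
            if exc.errno in ignore:
                return exc.errno
            if exc.errno in retry and attempts < max_retries:
                attempts += 1
                time.sleep(delay)
                continue
            raise


def make_failing_call(code):
    """Return a callable that always raises OSError with the given errno."""
    def _call(*_args):
        raise OSError(code, os.strerror(code))
    return _call


if __name__ == "__main__":
    tolerated = errno_wrap_sketch(
        make_failing_call(errno.ENOENT), ["x"], [errno.ENOENT], [errno.ESTALE, errno.EBUSY]
    )
    print("ENOENT tolerated:", tolerated == errno.ENOENT)
    try:
        errno_wrap_sketch(
            make_failing_call(errno.ENOTCONN), ["x"], [errno.ENOENT], [errno.ESTALE, errno.EBUSY]
        )
    except OSError as exc:
        print("ENOTCONN propagated:", exc.errno == errno.ENOTCONN)
```

Under these assumptions the lstat on the .gfid path is where the disconnect surfaces, which is not necessarily where it originates. The preceding log line, "Gluster Mount process exited [{error=ENOTCONN}]", suggests the mount itself went away.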
Diego Zuccato
2024-Feb-08 13:37 UTC
[Gluster-users] __Geo-replication status is getting Faulty after few seconds
That '1' means there's no corresponding file in the regular file structure (outside .glusterfs).
IIUC it shouldn't happen, but it does (quite often).
*Probably* it's safe to just delete it, but wait for advice from more competent users.

Diego

On 08/02/2024 13:42, Anant Saraswat wrote:
> Hi Everyone,
>
> As I was getting "OSError: [Errno 107] Transport endpoint is not
> connected: '.gfid/d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8' " error in the
> primary master node gsyncd log, So I started searching this file details
> and I found this file in the brick, under the .glusterfs folder on
> master1 node.
>
> Path on master1 -
> /opt/tier1data2019/brick/.glusterfs/d5/3f/d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8
>
> [root@master1 ~]# ls -lrt /opt/tier1data2019/brick/.glusterfs/d5/3f/
>
> -rw-r--r-- 2 root root   15996 Dec 14 10:10 d53feba6-dc8b-4645-a86c-befabd0e5069
> -rw-r--r-- 2 root root  343111 Dec 18 10:55 d53fed32-b47a-48bf-889e-140c69b04479
> -rw-r--r-- 2 root root 5060531 Dec 29 15:29 d53f184d-91e8-4bc1-b6e7-bb5f27ef8b41
> -rw-r--r-- 2 root root 2149782 Jan 12 13:25 d53ffee5-fa66-4493-8bdf-f2093b3f6ce7
> -rw-r--r-- 2 root root 1913460 Jan 18 10:40 d53f799b-0e87-4800-a3cd-fac9e1a30b54
> -rw-r--r-- 2 root root   62940 Jan 22 09:35 d53fb9d4-8c64-4a83-b968-bbbfb9af4224
> -rw-r--r-- 1 root root  174592 Jan 22 15:06 d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8
> -rw-r--r-- 2 root root    5633 Jan 26 08:36 d53f6bf6-9aac-476c-b8c5-0569fc8d5116
> -rw-r--r-- 2 root root  801740 Feb  8 11:40 d53f71f8-e88b-4ece-b66e-228c2b08d6c8
>
> Now I have noticed two things:
>
> First, this file is only present on the primary master node (master1)
> and doesn't exist on master2 and master3 nodes.
>
> Second, this file has different file attributes than other files in the
> folder. If you check the second column of the above output, every file
> has "2", but this file has "1".
> > Now, can someone please guide me why this file has "1" and what I should > do next? Is it safe to copy this file to the remaining two master nodes, > or should I delete it from master1? > > Many thanks, > Anant > > ------------------------------------------------------------------------ > *From:*?Gluster-users <gluster-users-bounces at gluster.org> on behalf of > Anant Saraswat <anant.saraswat at techblue.co.uk> > *Sent:*?08 February 2024 12:01 AM > *To:*?Aravinda <aravinda at kadalu.tech> > *Cc:*?gluster-users at gluster.org <gluster-users at gluster.org> > *Subject:*?Re: [Gluster-users] __Geo-replication status is getting > Faulty after few????seconds > > *EXTERNAL:?Do not click links or open attachments if you do not > recognize the sender.* > > Hi @Aravinda <mailto:aravinda at kadalu.tech>, > > I have checked the rsync version, and it's the same on primary and > secondary nodes. We have rsync version 3.1.3, protocol version 31, on > all servers. It's very strange that we have not made any changes, that > we are aware of, and this geo-replication was working fine for the last > 5 years, and suddenly it has stopped, and we are unable to understand > the root cause of it. > > > I have checked the tcpdump and I can see that the master node is sending > RST to the secondary node when geo-replication connects, but we are not > seeing any RST when we do the ssh using the root user from master to > secondary node ourselves, which makes me think that geo-replication is > able to connect to the secondary node but after that, it's not liking > something and tries to reset the connection, and this is repeating in a > loop. > > > I have also enabled geo-replication debug logs and I am getting this > error in the master node gsyncd logs. 
> > > [2024-02-07 22:37:36.820978] D [repce(worker > /opt/tier1data2019/brick):195:push] RepceClient: call > 2563661:140414778891136:1707345456.8209238 entry_ops([{'op': 'CREATE', > 'skip_entry': False, 'gfid': '3d57e1e4-7bd2-44f6-a6d1-d628208b3697', > 'entry': > '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8795785720233840105.docx', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': '3d57e1e4-7bd2-44f6-a6d1-d628208b3697', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8795785720233840105.docx'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '7bd35f91-1408-476d-869a-9936f2d94afc', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/0c3fb22f-0fbe-4445-845b-9d94d84a9888', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '3837018c-2f5e-43d4-ab58-0ed8b7456e73', 'entry': '.gfid/861afb81-386a-4b5b-af37-cef63a55a436/26fcd7e7-2c8c-4dcb-96f2-2c8a0d79f3d4', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'db311b10-b1e2-4b84-adea-a6746214aeda', 'entry': '.gfid/861afb81-386a-4b5b-af37-cef63a55a436/0526d0da-1f36-4203-8563-7e23aacf6237', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '9bbb253a-226a-44b1-a968-7cfa76cf9463', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeLLRenewalLetterDocusign_1_22_15_1_18_153.doc', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': '9bbb253a-226a-44b1-a968-7cfa76cf9463', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeLLRenewalLetterDocusign_1_22_15_1_18_153.doc'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'f62d0c65-6ede-48ff-b9bf-c44a33e5e023', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/85530794-c15f-44d4-8660-87a14c2c9c8c', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'fd3d0af6-8ef5-4b76-bb47-0bc508df0ed0', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeMOA_1_22_15_1_20_501.doc', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': 'fd3d0af6-8ef5-4b76-bb47-0bc508df0ed0', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeMOA_1_22_15_1_20_501.doc'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'e93c5771-9676-40d4-90cd-f0586ec05dd9', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/cc372667-3b77-468f-bac6-671d4eb069e9', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '02045f44-68ff-4a35-a843-08939afc46a4', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeTTRenewalLetterASTNoFee-2022_1_22_15_1_19_530.doc', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': '02045f44-68ff-4a35-a843-08939afc46a4', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeTTRenewalLetterASTNoFee-2022_1_22_15_1_19_530.doc'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '6f5766c9-2dc3-4636-9041-9cf4ac64d26b', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/556a0e3c-510d-4396-8f32-335aafec1314', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': 'f78561f0-c9f2-4192-a82a-8368e0ad8b2b', 'entry': '.gfid/ec161c2e-bb32-4639-a7b2-9be961221d86/app_1705935977525.tmp'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'd1e33edb-523e-41c1-a021-8bd3a5a2c7c0', 'entry': '.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/c655e3e5-9d4c-43d7-9171-949f01612e6d', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 'b6f44b28-c2bf-4e70-b953-1c559ded7835', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge7370453767656401681.docx', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': 'b6f44b28-c2bf-4e70-b953-1c559ded7835', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge7370453767656401681.docx'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': 
'2d845d9e-7a49-4200-a100-759fe831ba0e', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/84d47d84-5749-4a19-8f73-293078d17c63', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '44554c17-21aa-427a-b796-7ecec6af2570', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8634804987715893755.docx', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '652bf5d7-3b7a-41d8-aa4f-e52296034821', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/91a25682-69ea-4edc-9250-d6c7aac56853', 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 'gfid': '44554c17-21aa-427a-b796-7ecec6af2570', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8634804987715893755.docx'}, {'op': 'CREATE', 'skip_entry': False, 'gfid': '04720811-b90e-42b7-a5d1-656afd92e245', 'entry': '.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/a66cbc42-61dc-4896-bb69-c715f1a820db', 'mode': 33188, 'uid': 0, 'gid': 0}],) ... > > [2024-02-07 22:37:36.909606] D [repce(worker > /opt/tier1data2019/brick):215:__call__] RepceClient: call > 2563661:140414778891136:1707345456.8209238 entry_ops -> [] > [2024-02-07 22:37:36.911032] D [master(worker > /opt/tier1data2019/brick):317:a_syncdata] _GMaster: files > [{files={'.gfid/652bf5d7-3b7a-41d8-aa4f-e52296034821', > '.gfid/2d845d9e-7a49-4200-a100-759fe831ba0e', > '.gfid/3837018c-2f5e-43d4-ab58-0ed8b7456e73', > '.gfid/e93c5771-9676-40d4-90cd-f0586ec05dd9', > '.gfid/f62d0c65-6ede-48ff-b9bf-c44a33e5e023', > '.gfid/7bd35f91-1408-476d-869a-9936f2d94afc', > '.gfid/04720811-b90e-42b7-a5d1-656afd92e245', > '.gfid/6f5766c9-2dc3-4636-9041-9cf4ac64d26b', > '.gfid/db311b10-b1e2-4b84-adea-a6746214aeda', > '.gfid/d1e33edb-523e-41c1-a021-8bd3a5a2c7c0'}}] > [2024-02-07 22:37:36.911089] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/652bf5d7-3b7a-41d8-aa4f-e52296034821}] > [2024-02-07 22:37:36.911133] D [master(worker > 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/2d845d9e-7a49-4200-a100-759fe831ba0e}] > [2024-02-07 22:37:36.911169] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/3837018c-2f5e-43d4-ab58-0ed8b7456e73}] > [2024-02-07 22:37:36.911202] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/e93c5771-9676-40d4-90cd-f0586ec05dd9}] > [2024-02-07 22:37:36.911235] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/f62d0c65-6ede-48ff-b9bf-c44a33e5e023}] > [2024-02-07 22:37:36.911268] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/7bd35f91-1408-476d-869a-9936f2d94afc}] > [2024-02-07 22:37:36.911301] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/04720811-b90e-42b7-a5d1-656afd92e245}] > [2024-02-07 22:37:36.911333] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/6f5766c9-2dc3-4636-9041-9cf4ac64d26b}] > [2024-02-07 22:37:36.911366] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/db311b10-b1e2-4b84-adea-a6746214aeda}] > [2024-02-07 22:37:36.911398] D [master(worker > /opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for > syncing [{file=.gfid/d1e33edb-523e-41c1-a021-8bd3a5a2c7c0}] > [2024-02-07 22:37:36.911439] D [master(worker > /opt/tier1data2019/brick):1344:process] _GMaster: processing change > [{changelog=/var/lib/misc/gluster/gsyncd/tier1data_drtier1data_drtier1data/opt-tier1data2019-brick/.history/.processing/CHANGELOG.1705936007}] > [2024-02-07 22:37:36.915193] E [syncdutils(worker > /opt/tier1data2019/brick):346:log_raise_exception] <top>: Gluster Mount > process exited [{error=ENOTCONN}] > [2024-02-07 
22:37:36.915252] E [syncdutils(worker > /opt/tier1data2019/brick):363:log_raise_exception] <top>: FULL EXCEPTION > TRACE: > Traceback (most recent call last): > ? File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 317, > in main > ? ? func(args) > ? File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 86, > in subcmd_worker > ? ? local.service_loop(remote) > ? File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line > 1298, in service_loop > ? ? g3.crawlwrap(oneshot=True) > ? File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 604, > in crawlwrap > ? ? self.crawl() > ? File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1614, > in crawl > ? ? self.changelogs_batch_process(changes) > ? File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1510, > in changelogs_batch_process > ? ? self.process(batch) > ? File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1345, > in process > ? ? self.process_change(change, done, retry) > ? File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1071, > in process_change > ? ? st = lstat(pt) > ? File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line > 589, in lstat > ? ? return errno_wrap(os.lstat, [e], [ENOENT], [ESTALE, EBUSY]) > ? File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line > 571, in errno_wrap > ? ? 
return call(*arg) > OSError: [Errno 107] Transport endpoint is not connected: > '.gfid/d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8' > [2024-02-07 22:37:37.344426] I [monitor(monitor):228:monitor] Monitor: > worker died in startup phase [{brick=/opt/tier1data2019/brick}] > [2024-02-07 22:37:37.346601] I > [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker > Status Change [{status=Faulty}] > > > Thanks, > Anant > > ------------------------------------------------------------------------ > *From:*?Aravinda <aravinda at kadalu.tech> > *Sent:*?07 February 2024 2:54 PM > *To:*?Anant Saraswat <anant.saraswat at techblue.co.uk> > *Cc:*?Strahil Nikolov <hunter86_bg at yahoo.com>; gluster-users at gluster.org > <gluster-users at gluster.org> > *Subject:*?Re: [Gluster-users] __Geo-replication status is getting > Faulty after few????seconds > > *EXTERNAL:?Do not click links or open attachments if you do not > recognize the sender.* > > It will keep track of last sync time if you change to non-root user. But > I don't think the issue is related to root vs non-root user. > > Even in non-root user based Geo-rep, Primary volume is mounted using > root user only. Only in the secondary node, it will use Glusterd > mountbroker to allow mounting the Secondary volume as non-priviliaged user. > > Check the rsync version in Primary and secondary nodes. Please fix the > versions if not matching. > > -- > Aravinda > Kadalu Technologies > > > > ---- On Wed, 07 Feb 2024 20:11:47 +0530 *Anant Saraswat > <anant.saraswat at techblue.co.uk>*?wrote --- > > No, It was setup and running using the root user only. > > Do you think I should setup using ?a dedicated non-root user? will it > keep the track of old files or will it consider it as a new > geo-replication and copy all the files from the scratch? 
>
> ------------------------------------------------------------------------
> *From:* Strahil Nikolov <hunter86_bg at yahoo.com>
> *Sent:* 07 February 2024 2:36 PM
> *To:* Anant Saraswat <anant.saraswat at techblue.co.uk>; Aravinda <aravinda at kadalu.tech>
> *Cc:* gluster-users at gluster.org <gluster-users at gluster.org>
> *Subject:* Re: [Gluster-users] __Geo-replication status is getting Faulty after few seconds
>
> Have you tried setting up gluster georep with a dedicated non-root user?
>
> Best Regards,
> Strahil Nikolov
>
> On Tue, Feb 6, 2024 at 16:38, Anant Saraswat <anant.saraswat at techblue.co.uk> wrote:
> ________
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
> DISCLAIMER: This email and any files transmitted with it are
> confidential and intended solely for the use of the individual or entity
> to whom they are addressed. If you have received this email in error,
> please notify the sender. This message contains confidential information
> and is intended only for the individual named. If you are not the named
> addressee, you should not disseminate, distribute or copy this email.
> Please notify the sender immediately by email if you have received this
> email by mistake and delete this email from your system.
>
> If you are not the intended recipient, you are notified that disclosing,
> copying, distributing or taking any action in reliance on the contents
> of this information is strictly prohibited. Thanks for your cooperation.

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
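
P.S. On the suggestion in this thread to compare the rsync version on the primary and secondary nodes: a small Python sketch for doing that check consistently on each node. It assumes the first line of `rsync --version` looks like "rsync  version 3.1.3  protocol version 31"; the helper names parse_rsync_version and local_rsync_version are hypothetical, and the script has to be run on every node and the results compared by hand.

```python
import re
import subprocess

# Assumed shape of the first line of `rsync --version`, e.g.
# "rsync  version 3.1.3  protocol version 31" (an optional "v" prefix is allowed).
VERSION_RE = re.compile(r"rsync\s+version\s+v?(\S+)\s+protocol version\s+(\d+)")


def parse_rsync_version(text):
    """Return (version, protocol) parsed from `rsync --version` output."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError("unrecognised rsync --version output")
    return match.group(1), int(match.group(2))


def local_rsync_version():
    """Run `rsync --version` on this node and parse its output."""
    result = subprocess.run(["rsync", "--version"],
                            capture_output=True, text=True, check=True)
    return parse_rsync_version(result.stdout)


if __name__ == "__main__":
    version, protocol = local_rsync_version()
    print(f"rsync {version}, protocol {protocol}")
```

Matching output on all nodes, as already reported for this setup, would rule out an rsync mismatch as the cause.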