Marcus Pedersén
2018-Aug-02 08:04 UTC
[Gluster-users] Geo-replication stops after 4-5 hours
Hi Kotresh,

I get the following and then it hangs:

strace: Process 5921 attached
write(2, "rsync: link_stat \"/tmp/gsyncd-au"..., 12811

When sync is running I can see rsync with geouser on the slave node.

Regards
Marcus

################
Marcus Pedersén
Systemadministrator
Interbull Centre
################
Sent from my phone
################

On 2 Aug 2018 09:31, Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:

Cool, just check whether they are hung by any chance with the following command:

#strace -f -p 5921

On Thu, Aug 2, 2018 at 12:25 PM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:

On both active master nodes there is an rsync process, as in:

root 5921 0.0 0.0 115424 1176 ? S Aug01 0:00 rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --xattrs --acls . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-stuphs/bf60c68f1a195dad59573a8dbaa309f2.sock geouser@urd-gds-geo-001:/proc/13077/cwd

There are also ssh tunnels to the slave nodes and gsyncd.py processes.

Regards
Marcus

On 2 Aug 2018 08:07, Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:

Could you look for any rsync processes hung on the master or slave?

On Thu, Aug 2, 2018 at 11:18 AM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:

Hi Kotresh,

rsync version 3.1.2, protocol version 31.
All nodes run CentOS 7, updated within the last couple of days.

Thanks
Marcus

On 2 Aug 2018 06:13, Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:

Hi Marcus,

What is the rsync version being used?

Thanks,
Kotresh HR

On Thu, Aug 2, 2018 at 1:48 AM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:

Hi all!

I upgraded from 3.12.9 to 4.1.1 and had problems with geo-replication. With help from the list (some symlinks and so on, handled in another thread) I got geo-replication running. It ran for 4-5 hours and then stopped; I stopped and started geo-replication and it ran for another 4-5 hours. 4.1.2 was released and I updated, hoping this would solve the problem. I still have the same problem: after a start it runs for 4-5 hours and then stops. After that nothing happens; I have waited for days, but still nothing happens. I have looked through the logs but cannot find anything obvious.
Geo-replication status shows the same two nodes as active the whole time:

MASTER NODE    MASTER VOL        MASTER BRICK         SLAVE USER    SLAVE                                       SLAVE NODE         STATUS     CRAWL STATUS     LAST_SYNCED            ENTRY    DATA     META    FAILURES    CHECKPOINT TIME        CHECKPOINT COMPLETED    CHECKPOINT COMPLETION TIME
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
urd-gds-001    urd-gds-volume    /urd-gds/gluster     geouser       geouser@urd-gds-geo-001::urd-gds-volume     urd-gds-geo-000    Active     History Crawl    2018-04-16 20:32:09    0        14205    0       0           2018-07-27 21:12:44    No                      N/A
urd-gds-002    urd-gds-volume    /urd-gds/gluster     geouser       geouser@urd-gds-geo-001::urd-gds-volume     urd-gds-geo-002    Passive    N/A              N/A                    N/A      N/A      N/A     N/A         N/A                    N/A                     N/A
urd-gds-004    urd-gds-volume    /urd-gds/gluster     geouser       geouser@urd-gds-geo-001::urd-gds-volume     urd-gds-geo-002    Passive    N/A              N/A                    N/A      N/A      N/A     N/A         N/A                    N/A                     N/A
urd-gds-003    urd-gds-volume    /urd-gds/gluster     geouser       geouser@urd-gds-geo-001::urd-gds-volume     urd-gds-geo-000    Active     History Crawl    2018-05-01 20:58:14    285      4552     0       0           2018-07-27 21:12:44    No                      N/A
urd-gds-000    urd-gds-volume    /urd-gds/gluster1    geouser       geouser@urd-gds-geo-001::urd-gds-volume     urd-gds-geo-001    Passive    N/A              N/A                    N/A      N/A      N/A     N/A         N/A                    N/A                     N/A
urd-gds-000    urd-gds-volume    /urd-gds/gluster2    geouser       geouser@urd-gds-geo-001::urd-gds-volume     urd-gds-geo-001    Passive    N/A              N/A                    N/A      N/A      N/A     N/A         N/A                    N/A                     N/A

The master cluster is Distribute-Replicate 2 x (2 + 1), used space 30TB.
The slave cluster is Replicate 1 x (2 + 1), used space 9TB.

Parts of the gsyncd.logs are enclosed.

Thanks a lot!

Best regards
Marcus Pedersén
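For reference, the listing above is what the geo-replication "status detail" subcommand prints (plain "status" omits the ENTRY/DATA/META/FAILURES counters), and the session's current settings can be dumped with the bare "config" subcommand. A minimal sketch using the volume and slave names from this thread:

#gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume status detail
#gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume config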
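When strace attaches and shows nothing but a single pending write(2, ...) like the one above, the rsync worker is blocked writing an error message to fd 2 (normally a pipe back to gsyncd). A few standard Linux procfs checks, assuming PID 5921 from this thread, can confirm the hang before restarting anything:

#ps -o pid,stat,wchan:30,cmd -p 5921    # process state plus the kernel wait channel
#ls -l /proc/5921/fd/2                  # where the blocked write to fd 2 is going
#cat /proc/5921/stack                   # kernel-side stack of the task (needs root)
#strace -p 5921 -e trace=write -s 200   # re-attach and watch only the write calls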
Marcus Pedersén
2018-Aug-06 11:28 UTC
[Gluster-users] Geo-replication stops after 4-5 hours
Hi,

Is there a way to resolve the problem with rsync and hanging processes?

Do I need to kill all the processes and hope that it starts again, or stop and start geo-replication? If I stop and start geo-replication it will start again; I have tried that before.

Regards
Marcus

From: gluster-users-bounces at gluster.org <gluster-users-bounces at gluster.org> on behalf of Marcus Pedersén <marcus.pedersen at slu.se>
Sent: 2 August 2018 10:04
To: Kotresh Hiremath Ravishankar
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] Geo-replication stops after 4-5 hours
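For reference, the stop/start cycle described above maps onto the geo-replication CLI as sketched below, using the session names from this thread. The pkill pattern is an assumption based on the aux-ssh socket path shown earlier, and killing workers by hand should not normally be needed, since the gsyncd monitor respawns them:

#gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume stop
#pkill -f gsyncd-aux-ssh    # optional: clear any rsync/ssh workers left hanging (assumed pattern)
#gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume start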