Strahil Nikolov
2021-Mar-11 17:37 UTC
[Gluster-users] GeoRep Faulty after Gluster 7 to 8 upgrade - gfchangelog: wrong result
I think you have to increase the debug logs for the geo-rep session.
I will try to find the command necessary to increase it.

Best Regards,
Strahil Nikolov

On Thursday, 11 March 2021, 00:38:41 GMT+2, Matthew Benstead <matthewb at uvic.ca> wrote:

Thanks Strahil,

Right - I had come across your message in early January that v8 from the CentOS SIG was missing the SELinux rules, and had put SELinux into permissive mode after the upgrade when I saw denied messages in the audit logs.

[root@storage01 ~]# sestatus | egrep "^SELinux status|[mM]ode"
SELinux status:                 enabled
Current mode:                   permissive
Mode from config file:          enforcing

Yes - I am using an unprivileged user for georep:

[root@pcic-backup01 ~]# gluster-mountbroker status
+-------------+-------------+---------------------------+--------------+--------------------------+
|     NODE    | NODE STATUS |         MOUNT ROOT        |    GROUP     |          USERS           |
+-------------+-------------+---------------------------+--------------+--------------------------+
| 10.0.231.82 |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
|  localhost  |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
+-------------+-------------+---------------------------+--------------+--------------------------+

[root@pcic-backup02 ~]# gluster-mountbroker status
+-------------+-------------+---------------------------+--------------+--------------------------+
|     NODE    | NODE STATUS |         MOUNT ROOT        |    GROUP     |          USERS           |
+-------------+-------------+---------------------------+--------------+--------------------------+
| 10.0.231.81 |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
|  localhost  |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
+-------------+-------------+---------------------------+--------------+--------------------------+

Thanks,
 -Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matthewb at uvic.ca

On 3/10/21 2:11 PM, Strahil Nikolov wrote:
> Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information.
>
> I have tested georep on v8.3 and it was running quite well until you involve SELINUX.
>
> Are you using SELINUX ?
> Are you using unprivileged user for the georep ?
>
> Also, you can check https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-troubleshooting_geo-replication .
>
> Best Regards,
> Strahil Nikolov
>
>> On Thu, Mar 11, 2021 at 0:03, Matthew Benstead <matthewb at uvic.ca> wrote:
>>
>> Hello,
>>
>> I recently upgraded my Distributed-Replicate cluster from Gluster 7.9 to 8.3 on CentOS7 using the CentOS Storage SIG packages. I had geo-replication syncing properly before the upgrade, but now it is not working.
>>
>> After I had upgraded both master and slave clusters I attempted to start geo-replication again, but it goes to faulty quickly:
>>
>> [root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup start
>> Starting geo-replication session between storage & geoaccount@10.0.231.81::pcic-backup has been successful
>>
>> [root@storage01 ~]# gluster volume geo-replication status
>>
>> MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> 10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>> 10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>> 10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>> 10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>> 10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>> 10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>> 10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>> 10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>> 10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>
>> [root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup stop
>> Stopping geo-replication session between storage & geoaccount@10.0.231.81::pcic-backup has been successful
>>
>> I went through the gsyncd logs and see it attempts to go back through the changelogs - which would make sense - but fails:
>>
>> [2021-03-10 19:18:42.165807] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
>> [2021-03-10 19:18:42.166136] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}]
>> [2021-03-10 19:18:42.167829] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}]
>> [2021-03-10 19:18:42.172343] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
>> [2021-03-10 19:18:42.172580] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}]
>> [2021-03-10 19:18:42.235574] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
>> [2021-03-10 19:18:42.236613] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
>> [2021-03-10 19:18:42.238614] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
>> [2021-03-10 19:18:44.144856] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9059}]
>> [2021-03-10 19:18:44.145065] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
>> [2021-03-10 19:18:44.162873] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9259}]
>> [2021-03-10 19:18:44.163412] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
>> [2021-03-10 19:18:44.167506] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9316}]
>> [2021-03-10 19:18:44.167746] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
>> [2021-03-10 19:18:45.251372] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1062}]
>> [2021-03-10 19:18:45.251583] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
>> [2021-03-10 19:18:45.271950] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1041}]
>> [2021-03-10 19:18:45.272118] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
>> [2021-03-10 19:18:45.275180] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1116}]
>> [2021-03-10 19:18:45.275361] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
>> [2021-03-10 19:18:47.265618] I [master(worker /data/storage_b/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}]
>> [2021-03-10 19:18:47.265954] I [resource(worker /data/storage_b/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}]
>> [2021-03-10 19:18:47.276746] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
>> [2021-03-10 19:18:47.281194] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
>> [2021-03-10 19:18:47.281404] I [master(worker /data/storage_b/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664113, 0)}, {etime=1615403927}]
>> [2021-03-10 19:18:47.285340] I [master(worker /data/storage_c/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}]
>> [2021-03-10 19:18:47.285579] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}]
>> [2021-03-10 19:18:47.287383] I [master(worker /data/storage_a/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}]
>> [2021-03-10 19:18:47.287697] I [resource(worker /data/storage_a/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}]
>> [2021-03-10 19:18:47.298415] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
>> [2021-03-10 19:18:47.301342] I [gsyncdstatus(worker /data/storage_a/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
>> [2021-03-10 19:18:47.304183] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
>> [2021-03-10 19:18:47.304418] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615403927}]
>> [2021-03-10 19:18:47.305294] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
>> [2021-03-10 19:18:47.308124] I [gsyncdstatus(worker /data/storage_a/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
>> [2021-03-10 19:18:47.308509] I [master(worker /data/storage_a/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666553, 0)}, {entry_stime=(1614664115, 0)}, {etime=1615403927}]
>> [2021-03-10 19:18:47.357470] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
>> [2021-03-10 19:18:47.383949] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
>> [2021-03-10 19:18:48.255340] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}]
>> [2021-03-10 19:18:48.260052] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
>> [2021-03-10 19:18:48.275651] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}]
>> [2021-03-10 19:18:48.278064] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}]
>> [2021-03-10 19:18:48.280453] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
>> [2021-03-10 19:18:48.282274] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
>> [2021-03-10 19:18:58.275702] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
>> [2021-03-10 19:18:58.276041] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}]
>> [2021-03-10 19:18:58.296252] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
>> [2021-03-10 19:18:58.296506] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}]
>> [2021-03-10 19:18:58.301290] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
>> [2021-03-10 19:18:58.301521] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}]
>> [2021-03-10 19:18:58.345817] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
>> [2021-03-10 19:18:58.361268] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
>> [2021-03-10 19:18:58.367985] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
>> [2021-03-10 19:18:59.115143] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change [{status=Stopped}]
>>
>> It seems like there is an issue selecting the changelogs - perhaps similar to this issue? https://github.com/gluster/glusterfs/issues/1766
>>
>> [root@storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_a-storage.log
>> [2021-03-10 19:18:45.284764] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}]
>> [2021-03-10 19:18:45.285275] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
>> [2021-03-10 19:18:45.285269] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
>> [2021-03-10 19:18:45.286615] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 21
>> [2021-03-10 19:18:47.308607] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615403927}]
>> [2021-03-10 19:18:47.308659] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1250878}]
>> [2021-03-10 19:18:47.383774] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1250877}]
>>
>> [root@storage01 storage_10.0.231.81_pcic-backup]# tail -7 changes-data-storage_b-storage.log
>> [2021-03-10 19:18:45.263211] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
>> [2021-03-10 19:18:45.263151] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}]
>> [2021-03-10 19:18:45.263294] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
>> [2021-03-10 19:18:45.264598] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23
>> [2021-03-10 19:18:47.281499] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615403927}]
>> [2021-03-10 19:18:47.281551] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1258258}]
>> [2021-03-10 19:18:47.357244] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1258257}]
>>
>> Any ideas on where to debug this? I'd prefer not to have to remove and re-sync everything as there is about 240TB on the cluster...
>>
>> Thanks,
>>  -Matthew
>>
>> ________
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://meet.google.com/cpu-eiue-hvk
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
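
The three gfchangelog entries quoted above ("Requesting historical changelogs", "changelogs min max" and "wrong result") carry their values as epoch seconds, which are hard to compare by eye. The following is a minimal sketch, not part of Gluster, that pulls the {key=value} fields out of such entries and prints the requested and available ranges as UTC times. The helper names parse_fields, to_utc and summarize are hypothetical and exist only for this illustration; the sample text is copied from the changes-data-storage_a-storage.log excerpt.

```python
import re
from datetime import datetime, timezone

# Sample entries copied from the changes-data-storage_a-storage.log excerpt.
SAMPLE = """\
[2021-03-10 19:18:47.308607] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615403927}]
[2021-03-10 19:18:47.308659] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1250878}]
[2021-03-10 19:18:47.383774] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1250877}]
"""

# Matches the structured "{key=value}" pairs at the end of each entry.
FIELD = re.compile(r"\{(\w+)=([^}]*)\}")


def parse_fields(line):
    """Return the {key=value} pairs of one log entry as a dict of strings."""
    return dict(FIELD.findall(line))


def to_utc(epoch):
    """Format epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def summarize(text):
    """Collect the requested range, available range and failure details."""
    summary = {}
    for line in text.splitlines():
        fields = parse_fields(line)
        if "Requesting historical changelogs" in line:
            summary["requested"] = (int(fields["start"]), int(fields["end"]))
        elif "changelogs min max" in line:
            summary["available"] = (int(fields["min"]), int(fields["max"]))
            summary["total"] = int(fields["total_changelogs"])
        elif "wrong result" in line:
            summary["failed_for"] = fields["for"]
            summary["failed_idx"] = int(fields["idx"])
    return summary


if __name__ == "__main__":
    s = summarize(SAMPLE)
    req_start, req_end = s["requested"]
    avail_min, avail_max = s["available"]
    print("requested:", to_utc(req_start), "->", to_utc(req_end))
    print("available:", to_utc(avail_min), "->", to_utc(avail_max))
    print("failed for:", s["failed_for"], "at idx", s["failed_idx"], "of", s["total"])
```

For this sample the requested end, 1615403927, converts to 2021-03-10 19:18:47 UTC, the same second as the "Register time" entry in the gsyncd log, and the failing idx is total_changelogs - 1. The sketch only restates what the log already contains; it does not say why the lookup for the end of the range fails.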
Matthew Benstead
2021-Mar-11 19:36 UTC
[Gluster-users] GeoRep Faulty after Gluster 7 to 8 upgrade - gfchangelog: wrong result
Hi Strahil,

It looks like perhaps the changelog_log_level and log_level options? I've set them to debug:

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup config | egrep -i "log_level"
changelog_log_level:INFO
cli_log_level:INFO
gluster_log_level:INFO
log_level:INFO
slave_gluster_log_level:INFO
slave_log_level:INFO

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup config changelog_log_level DEBUG
geo-replication config updated successfully

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup config log_level DEBUG
geo-replication config updated successfully

Then I restarted geo-replication:

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup start
Starting geo-replication session between storage & geoaccount@10.0.231.81::pcic-backup has been successful

[root@storage01 ~]# gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS             CRAWL STATUS    LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A

[root@storage01 ~]# gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup stop
Stopping geo-replication session between storage & geoaccount@10.0.231.81::pcic-backup has been successful

The changelogs didn't really show anything new around changelog selection:

[root@storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_a-storage.log | egrep "2021-03-11"
[2021-03-11 19:15:30.552889] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}]
[2021-03-11 19:15:30.552893] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2021-03-11 19:15:30.552894] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:30.553633] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:30.553634] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:30.554236] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited.
[2021-03-11 19:15:30.554403] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2021-03-11 19:15:30.554420] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:30.554933] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay
[2021-03-11 19:15:30.554944] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:30.554949] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9
[2021-03-11 19:15:30.555002] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23
[2021-03-11 19:15:30.555324] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0
[2021-03-11 19:15:30.555345] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins
[2021-03-11 19:15:30.555351] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout
[2021-03-11 19:15:30.555358] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:30.555399] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay
[2021-03-11 19:15:30.555406] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:32.555711] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning
[2021-03-11 19:15:32.572157] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615490132}]
[2021-03-11 19:15:32.572436] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615490121}, {total_changelogs=1256897}]
[2021-03-11 19:15:32.621244] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615490121}, {idx=1256896}]
[2021-03-11 19:15:46.733182] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}]
[2021-03-11 19:15:46.733316] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2021-03-11 19:15:46.733348] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:46.734031] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:46.734085] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:46.734591] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited.
[2021-03-11 19:15:46.734755] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2021-03-11 19:15:46.734772] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:46.735256] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay
[2021-03-11 19:15:46.735266] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:46.735271] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9
[2021-03-11 19:15:46.735325] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 21
[2021-03-11 19:15:46.735704] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0
[2021-03-11 19:15:46.735721] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins
[2021-03-11 19:15:46.735726] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout
[2021-03-11 19:15:46.735733] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:46.735771] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay
[2021-03-11 19:15:46.735778] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:47.618464] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning

[root@storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_b-storage.log | egrep "2021-03-11"
[2021-03-11 19:15:30.611457] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}]
[2021-03-11 19:15:30.611574] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:30.611641] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:30.611645] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:30.612325] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited.
[2021-03-11 19:15:30.612488] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2021-03-11 19:15:30.612507] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:30.613005] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay
[2021-03-11 19:15:30.613130] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:30.613142] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9
[2021-03-11 19:15:30.613208] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 22
[2021-03-11 19:15:30.613545] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0
[2021-03-11 19:15:30.613567] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins
[2021-03-11 19:15:30.613574] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout
[2021-03-11 19:15:30.613582] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:30.613637] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay
[2021-03-11 19:15:30.613654] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:32.614273] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning
[2021-03-11 19:15:32.643628] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615490132}]
[2021-03-11 19:15:32.643716] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615490123}, {total_changelogs=1264296}]
[2021-03-11 19:15:32.700397] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615490123}, {idx=1264295}]
[2021-03-11 19:15:46.832322] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}]
[2021-03-11 19:15:46.832394] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2021-03-11 19:15:46.832465] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:46.832531] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:46.833086] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:46.833648] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited.
[2021-03-11 19:15:46.833817] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0 [2021-03-11 19:15:46.833835] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:46.834368] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay [2021-03-11 19:15:46.834380] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:46.834386] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9 [2021-03-11 19:15:46.834441] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23 [2021-03-11 19:15:46.834768] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0 [2021-03-11 19:15:46.834789] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins [2021-03-11 19:15:46.834795] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout [2021-03-11 19:15:46.834802] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:46.834845] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay [2021-03-11 19:15:46.834853] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:47.618476] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning

gsyncd logged a lot but I'm not sure if it's helpful:

[2021-03-11 19:15:00.41898] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:21.551302] D [gsyncd(config-get):303:main] <top>: Using session config file
[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:21.631470] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:21.718386] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:21.804991] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.203999] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.284775] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.573355] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.653752] D [gsyncd(monitor):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.756994] D [monitor(monitor):304:distribute] <top>: master bricks: [{'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_a/storage'}, {'host': '10.0.231.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_b/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_c/storage'}, {'host': '10.0.231.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_a/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_b/storage'}, {'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_c/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_a/storage'}, {'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_b/storage'}, {'host': '10.0.231.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_c/storage'}]
[2021-03-11 19:15:26.757252] D [monitor(monitor):314:distribute] <top>: slave SSH gateway: geoaccount@10.0.231.81 [2021-03-11 19:15:27.416235] D [monitor(monitor):334:distribute] <top>: slave bricks: [{'host': '10.0.231.81', 'uuid': 'b88dea4f-31ec-416a-9110-3ccdc3910acd', 'dir': '/data/brick'}, {'host': '10.0.231.82', 'uuid': 'be50a8de-3934-4fee-a80d-8e2e99017902', 'dir': '/data/brick'}] [2021-03-11 19:15:27.416825] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_a/storage' [2021-03-11 19:15:27.417273] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_c/storage' [2021-03-11 19:15:27.417515] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_b/storage' [2021-03-11 19:15:27.417763] D [monitor(monitor):348:distribute] <top>: worker specs: [({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_a/storage'}, ('geoaccount@10.0.231.81', 'b88dea4f-31ec-416a-9110-3ccdc3910acd'), '1', False), ({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_c/storage'}, ('geoaccount@10.0.231.82', 'be50a8de-3934-4fee-a80d-8e2e99017902'), '2', False), ({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_b/storage'}, ('geoaccount@10.0.231.82', 'be50a8de-3934-4fee-a80d-8e2e99017902'), '3', False)]
[2021-03-11 19:15:27.425009] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] [2021-03-11 19:15:27.426764] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] [2021-03-11 19:15:27.429208] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] [2021-03-11 19:15:27.432280] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:27.434195] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:27.436584] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:27.478806] D [gsyncd(worker /data/storage_c/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:27.478852] D [gsyncd(worker /data/storage_b/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:27.480104] D [gsyncd(worker /data/storage_a/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:27.500456] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-11 19:15:27.501375] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:27.502003] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:27.525511] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490127.53 __repce_version__() ... [2021-03-11 19:15:27.525582] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490127.53 __repce_version__() ... [2021-03-11 19:15:27.526089] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490127.53 __repce_version__() ... [2021-03-11 19:15:29.435985] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490127.53 __repce_version__ -> 1.0 [2021-03-11 19:15:29.436213] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490129.44 version() ... [2021-03-11 19:15:29.437136] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490129.44 version -> 1.0 [2021-03-11 19:15:29.437268] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490129.44 pid() ... [2021-03-11 19:15:29.437915] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490129.44 pid -> 157321 [2021-03-11 19:15:29.438004] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9359}] [2021-03-11 19:15:29.438072] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... 
[2021-03-11 19:15:29.494538] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490127.53 __repce_version__ -> 1.0 [2021-03-11 19:15:29.494748] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490129.49 version() ... [2021-03-11 19:15:29.495290] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490129.49 version -> 1.0 [2021-03-11 19:15:29.495400] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490129.5 pid() ... [2021-03-11 19:15:29.495872] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490129.5 pid -> 88110 [2021-03-11 19:15:29.495960] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9944}] [2021-03-11 19:15:29.496028] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:29.501255] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490127.53 __repce_version__ -> 1.0 [2021-03-11 19:15:29.501454] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490129.5 version() ... [2021-03-11 19:15:29.502258] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490129.5 version -> 1.0 [2021-03-11 19:15:29.502444] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490129.5 pid() ... [2021-03-11 19:15:29.503140] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490129.5 pid -> 88111 [2021-03-11 19:15:29.503232] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. 
[{duration=2.0026}] [2021-03-11 19:15:29.503302] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:29.533899] D [resource(worker /data/storage_a/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:29.595736] D [resource(worker /data/storage_b/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:29.601110] D [resource(worker /data/storage_c/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:30.541542] D [resource(worker /data/storage_a/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:30.541816] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1037}] [2021-03-11 19:15:30.541887] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:30.542042] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:30.542125] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_a/storage) connected [2021-03-11 19:15:30.543323] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:30.544460] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:30.552103] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:30.602937] D [resource(worker /data/storage_b/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:30.603117] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: 
Mounted gluster volume [{duration=1.1070}] [2021-03-11 19:15:30.603197] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:30.603353] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:30.603338] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_b/storage) connected [2021-03-11 19:15:30.604620] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:30.605600] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:30.608365] D [resource(worker /data/storage_c/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:30.608534] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1052}] [2021-03-11 19:15:30.608612] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. 
Acknowledging back to monitor [2021-03-11 19:15:30.608762] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:30.608779] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_c/storage) connected [2021-03-11 19:15:30.610033] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:30.610637] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:30.610970] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:30.616197] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:31.371265] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:31.451000] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:31.537257] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:31.623800] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:32.555840] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:32.556051] D [master(worker 
/data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:32.556122] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:32.556179] I [master(worker /data/storage_a/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}] [2021-03-11 19:15:32.556359] I [resource(worker /data/storage_a/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}] [2021-03-11 19:15:32.556823] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140570487928576:1615490132.56 keep_alive(None,) ... [2021-03-11 19:15:32.558429] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140570487928576:1615490132.56 keep_alive -> 1 [2021-03-11 19:15:32.558974] D [master(worker /data/storage_a/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ... 
[2021-03-11 19:15:32.567478] I [gsyncdstatus(worker /data/storage_a/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-11 19:15:32.571824] I [gsyncdstatus(worker /data/storage_a/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-11 19:15:32.572052] I [master(worker /data/storage_a/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666553, 0)}, {entry_stime=(1614664115, 0)}, {etime=1615490132}] [2021-03-11 19:15:32.614506] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:32.614701] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:32.614788] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:32.614845] I [master(worker /data/storage_b/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}] [2021-03-11 19:15:32.615000] I [resource(worker /data/storage_b/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}] [2021-03-11 19:15:32.615586] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139889215526656:1615490132.62 keep_alive(None,) ... [2021-03-11 19:15:32.617373] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139889215526656:1615490132.62 keep_alive -> 1 [2021-03-11 19:15:32.618144] D [master(worker /data/storage_b/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ... 
[2021-03-11 19:15:32.619323] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:32.619491] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:32.619739] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:32.619863] I [master(worker /data/storage_c/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}] [2021-03-11 19:15:32.620040] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}] [2021-03-11 19:15:32.620599] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140386886469376:1615490132.62 keep_alive(None,) ... [2021-03-11 19:15:32.621397] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-11 19:15:32.622035] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140386886469376:1615490132.62 keep_alive -> 1 [2021-03-11 19:15:32.622701] D [master(worker /data/storage_c/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ... 
[2021-03-11 19:15:32.627031] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-11 19:15:32.643184] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-11 19:15:32.643528] I [master(worker /data/storage_b/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664113, 0)}, {etime=1615490132}] [2021-03-11 19:15:32.645148] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-11 19:15:32.649631] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-11 19:15:32.649882] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615490132}] [2021-03-11 19:15:32.650907] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-11 19:15:32.700489] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-11 19:15:33.545886] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}] [2021-03-11 19:15:33.550487] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:33.606991] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}] [2021-03-11 19:15:33.611573] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:33.612337] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase 
[{brick=/data/storage_c/storage}] [2021-03-11 19:15:33.615777] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:34.684247] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:34.764971] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:34.851174] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:34.937166] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:36.994502] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:37.73805] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:37.159288] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:37.244153] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:38.916510] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:38.997649] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:39.84816] D 
[gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:39.172045] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:40.896359] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:40.976135] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:41.62052] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:41.147902] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:42.791997] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:42.871239] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:42.956609] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.42473] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.566190] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-11 19:15:43.566400] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker 
[{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] [2021-03-11 19:15:43.572240] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:43.612744] D [gsyncd(worker /data/storage_a/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.625689] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-11 19:15:43.626060] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] [2021-03-11 19:15:43.632287] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-11 19:15:43.632137] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:43.632508] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] [2021-03-11 19:15:43.635565] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:43.637835] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:43.661304] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490143.66 __repce_version__() ... 
[2021-03-11 19:15:43.674499] D [gsyncd(worker /data/storage_b/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.680706] D [gsyncd(worker /data/storage_c/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.693773] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:43.700957] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:43.717686] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490143.72 __repce_version__() ... [2021-03-11 19:15:43.725369] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490143.73 __repce_version__() ... 
[2021-03-11 19:15:44.289117] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:44.375693] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:44.472251] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:44.558429] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:45.619694] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490143.66 __repce_version__ -> 1.0 [2021-03-11 19:15:45.619930] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490145.62 version() ... [2021-03-11 19:15:45.621191] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490145.62 version -> 1.0 [2021-03-11 19:15:45.621332] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490145.62 pid() ... [2021-03-11 19:15:45.621859] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490145.62 pid -> 158229 [2021-03-11 19:15:45.621939] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9862}] [2021-03-11 19:15:45.622000] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... 
[2021-03-11 19:15:45.714468] D [resource(worker /data/storage_a/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:45.718441] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490143.73 __repce_version__ -> 1.0 [2021-03-11 19:15:45.718643] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490145.72 version() ... [2021-03-11 19:15:45.719492] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490145.72 version -> 1.0 [2021-03-11 19:15:45.719772] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490145.72 pid() ... [2021-03-11 19:15:45.720202] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490143.72 __repce_version__ -> 1.0 [2021-03-11 19:15:45.720381] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490145.72 version() ... [2021-03-11 19:15:45.720463] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490145.72 pid -> 88921 [2021-03-11 19:15:45.720694] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=2.0196}] [2021-03-11 19:15:45.720882] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:45.721146] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490145.72 version -> 1.0 [2021-03-11 19:15:45.721271] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490145.72 pid() ... 
[2021-03-11 19:15:45.721795] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490145.72 pid -> 88924 [2021-03-11 19:15:45.721911] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=2.0280}] [2021-03-11 19:15:45.721993] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:45.816891] D [resource(worker /data/storage_b/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:45.816960] D [resource(worker /data/storage_c/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:46.721534] D [resource(worker /data/storage_a/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:46.721726] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.0997}] [2021-03-11 19:15:46.721796] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. 
Acknowledging back to monitor [2021-03-11 19:15:46.721971] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:46.722122] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_a/storage) connected [2021-03-11 19:15:46.723871] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:46.725100] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:46.732400] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:46.823477] D [resource(worker /data/storage_c/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:46.823645] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1027}] [2021-03-11 19:15:46.823754] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:46.823932] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:46.823904] D [resource(worker /data/storage_b/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:46.823930] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_c/storage) connected [2021-03-11 19:15:46.824103] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1020}] [2021-03-11 19:15:46.824184] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. 
Acknowledging back to monitor [2021-03-11 19:15:46.824340] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:46.824321] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_b/storage) connected [2021-03-11 19:15:46.825100] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:46.825414] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:46.826375] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:46.826574] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:46.831506] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:46.833168] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:47.275141] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:47.320247] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:47.570877] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:47.615571] D [gsyncd(config-get):303:main] <top>: Using session config file 
[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:47.620893] E [syncdutils(worker /data/storage_a/storage):325:log_raise_exception] <top>: connection to peer is broken [2021-03-11 19:15:47.620939] E [syncdutils(worker /data/storage_c/storage):325:log_raise_exception] <top>: connection to peer is broken [2021-03-11 19:15:47.621668] E [syncdutils(worker /data/storage_a/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-_AyCOc/79fa3dc75e30f532b4a40bc08c2b10a1.sock geoaccount at 10.0.231.81 /nonexistent/gsyncd slave storage geoaccount at 10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_a/storage --local-node 10.0.231.81 --local-node-id b88dea4f-31ec-416a-9110-3ccdc3910acd --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}] [2021-03-11 19:15:47.621685] E [syncdutils(worker /data/storage_c/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-WOgOEu/e15fc58bb13552de0710eaf018209548.sock geoaccount at 10.0.231.82 /nonexistent/gsyncd slave storage geoaccount at 10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_c/storage --local-node 10.0.231.82 --local-node-id be50a8de-3934-4fee-a80d-8e2e99017902 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}] [2021-03-11 19:15:47.621776] E [syncdutils(worker /data/storage_a/storage):851:logerr] Popen: ssh> 
Killed by signal 15. [2021-03-11 19:15:47.621819] E [syncdutils(worker /data/storage_c/storage):851:logerr] Popen: ssh> Killed by signal 15. [2021-03-11 19:15:47.621850] E [syncdutils(worker /data/storage_b/storage):325:log_raise_exception] <top>: connection to peer is broken [2021-03-11 19:15:47.622437] E [syncdutils(worker /data/storage_b/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-Vy935W/e15fc58bb13552de0710eaf018209548.sock geoaccount at 10.0.231.82 /nonexistent/gsyncd slave storage geoaccount at 10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_b/storage --local-node 10.0.231.82 --local-node-id be50a8de-3934-4fee-a80d-8e2e99017902 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}] [2021-03-11 19:15:47.622556] E [syncdutils(worker /data/storage_b/storage):851:logerr] Popen: ssh> Killed by signal 15. 
[2021-03-11 19:15:47.723756] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}]
[2021-03-11 19:15:47.731405] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:47.825223] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}]
[2021-03-11 19:15:47.825685] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}]
[2021-03-11 19:15:47.829011] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:47.830965] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:48.669634] D [gsyncd(monitor-status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:48.683784] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change [{status=Stopped}]

Thanks,
-Matthew

On 3/11/21 9:37 AM, Strahil Nikolov wrote:
> Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information.
>
> I think you have to increase the debug logs for geo-rep session.
> I will try to find the command necessary to increase it.
>
> Best Regards,
> Strahil Nikolov
>
> On Thursday, 11 March 2021, 00:38:41 GMT+2, Matthew Benstead <matthewb at uvic.ca> wrote:
>
> Thanks Strahil,
>
> Right - I had come across your message in early January that v8 from the CentOS Sig was missing the SELinux rules, and had put SELinux into permissive mode after the upgrade when I saw denied messages in the audit logs.
>
> [root at storage01 ~]# sestatus | egrep "^SELinux status|[mM]ode"
> SELinux status:                 enabled
> Current mode:                   permissive
> Mode from config file:          enforcing
>
> Yes - I am using an unprivileged user for georep:
>
> [root at pcic-backup01 ~]# gluster-mountbroker status
> +-------------+-------------+---------------------------+--------------+--------------------------+
> |     NODE    | NODE STATUS |         MOUNT ROOT        |    GROUP     |          USERS           |
> +-------------+-------------+---------------------------+--------------+--------------------------+
> | 10.0.231.82 |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
> |  localhost  |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
> +-------------+-------------+---------------------------+--------------+--------------------------+
>
> [root at pcic-backup02 ~]# gluster-mountbroker status
> +-------------+-------------+---------------------------+--------------+--------------------------+
> |     NODE    | NODE STATUS |         MOUNT ROOT        |    GROUP     |          USERS           |
> +-------------+-------------+---------------------------+--------------+--------------------------+
> | 10.0.231.81 |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
> |  localhost  |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
> +-------------+-------------+---------------------------+--------------+--------------------------+
>
> Thanks,
> -Matthew
>
> --
> Matthew Benstead
> System Administrator
> Pacific Climate Impacts Consortium
> University of Victoria, UH1
> PO Box 1800, STN CSC
> Victoria, BC, V8W 2Y2
> Phone: +1-250-721-8432
> Email: matthewb at uvic.ca
>
> On 3/10/21 2:11 PM, Strahil Nikolov wrote:
>>
>> I have tested georep on v8.3 and it was running quite well until you involve SELinux.
>>
>> Are you using SELinux?
>>
>> Are you using an unprivileged user for the georep?
>>
>> Also, you can check https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-troubleshooting_geo-replication .
>>
>> Best Regards,
>> Strahil Nikolov
>>
>>> On Thu, Mar 11, 2021 at 0:03, Matthew Benstead <matthewb at uvic.ca> wrote:
>>>
>>> Hello,
>>>
>>> I recently upgraded my Distributed-Replicate cluster from Gluster 7.9 to 8.3 on CentOS7 using the CentOS Storage SIG packages. I had geo-replication syncing properly before the upgrade, but now it is not working.
>>>
>>> After I had upgraded both master and slave clusters I attempted to start geo-replication again, but it goes to faulty quickly:
>>>
>>> [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup start
>>> Starting geo-replication session between storage & geoaccount at 10.0.231.81::pcic-backup has been successful
>>>
>>> [root at storage01 ~]# gluster volume geo-replication status
>>>
>>> MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                           SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
>>> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> 10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>> 10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>> 10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>> 10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>> 10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>> 10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>> 10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>> 10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>> 10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount at 10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
>>>
>>> [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup stop
>>> Stopping geo-replication session between storage & geoaccount at 10.0.231.81::pcic-backup has been successful
>>>
>>> I went through the gsyncd logs and see it attempts to go back through the changelogs - which would make sense - but fails:
>>>
>>> [2021-03-10 19:18:42.165807] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
>>> [2021-03-10 19:18:42.166136] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}]
>>> [2021-03-10 19:18:42.167829] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}]
>>> [2021-03-10 19:18:42.172343] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
>>> [2021-03-10 19:18:42.172580] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}]
>>> [2021-03-10 19:18:42.235574] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
>>> [2021-03-10 19:18:42.236613] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>> [2021-03-10 19:18:42.238614] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>> [2021-03-10 19:18:44.144856] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9059}] >>> [2021-03-10 19:18:44.145065] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... >>> [2021-03-10 19:18:44.162873] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9259}] >>> [2021-03-10 19:18:44.163412] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... >>> [2021-03-10 19:18:44.167506] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9316}] >>> [2021-03-10 19:18:44.167746] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... >>> [2021-03-10 19:18:45.251372] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1062}] >>> [2021-03-10 19:18:45.251583] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor >>> [2021-03-10 19:18:45.271950] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1041}] >>> [2021-03-10 19:18:45.272118] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. 
Acknowledging back to monitor >>> [2021-03-10 19:18:45.275180] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1116}] >>> [2021-03-10 19:18:45.275361] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor >>> [2021-03-10 19:18:47.265618] I [master(worker /data/storage_b/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}] >>> [2021-03-10 19:18:47.265954] I [resource(worker /data/storage_b/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] >>> [2021-03-10 19:18:47.276746] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] >>> [2021-03-10 19:18:47.281194] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] >>> [2021-03-10 19:18:47.281404] I [master(worker /data/storage_b/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664113, 0)}, {etime=1615403927}] >>> [2021-03-10 19:18:47.285340] I [master(worker /data/storage_c/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}] >>> [2021-03-10 19:18:47.285579] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] >>> [2021-03-10 19:18:47.287383] I [master(worker /data/storage_a/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}] >>> [2021-03-10 19:18:47.287697] I [resource(worker /data/storage_a/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] >>> [2021-03-10 19:18:47.298415] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: 
Worker Status Change [{status=Active}] >>> [2021-03-10 19:18:47.301342] I [gsyncdstatus(worker /data/storage_a/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] >>> [2021-03-10 19:18:47.304183] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] >>> [2021-03-10 19:18:47.304418] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615403927}] >>> [2021-03-10 19:18:47.305294] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] >>> [2021-03-10 19:18:47.308124] I [gsyncdstatus(worker /data/storage_a/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] >>> [2021-03-10 19:18:47.308509] I [master(worker /data/storage_a/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666553, 0)}, {entry_stime=(1614664115, 0)}, {etime=1615403927}] >>> [2021-03-10 19:18:47.357470] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] >>> [2021-03-10 19:18:47.383949] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] >>> [2021-03-10 19:18:48.255340] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}] >>> [2021-03-10 19:18:48.260052] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-10 19:18:48.275651] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}] >>> [2021-03-10 19:18:48.278064] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}] >>> [2021-03-10 
19:18:48.280453] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-10 19:18:48.282274] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-10 19:18:58.275702] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] >>> [2021-03-10 19:18:58.276041] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] >>> [2021-03-10 19:18:58.296252] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] >>> [2021-03-10 19:18:58.296506] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] >>> [2021-03-10 19:18:58.301290] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] >>> [2021-03-10 19:18:58.301521] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] >>> [2021-03-10 19:18:58.345817] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>> [2021-03-10 19:18:58.361268] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>> [2021-03-10 19:18:58.367985] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>> [2021-03-10 19:18:59.115143] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change [{status=Stopped}] >>> >>> It seems like there is an issue selecting the changelogs - perhaps similar to this issue? 
https://github.com/gluster/glusterfs/issues/1766 >>> >>> [root at storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_a-storage.log >>> [2021-03-10 19:18:45.284764] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}] >>> [2021-03-10 19:18:45.285275] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] >>> [2021-03-10 19:18:45.285269] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] >>> [2021-03-10 19:18:45.286615] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 21 >>> [2021-03-10 19:18:47.308607] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615403927}] >>> [2021-03-10 19:18:47.308659] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1250878}] >>> [2021-03-10 19:18:47.383774] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1250877}] >>> >>> [root at storage01 storage_10.0.231.81_pcic-backup]# tail -7 changes-data-storage_b-storage.log >>> [2021-03-10 19:18:45.263211] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] >>> [2021-03-10 19:18:45.263151] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}] >>> [2021-03-10 19:18:45.263294] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] >>> [2021-03-10 19:18:45.264598] I [socket.c:929:__socket_server_bind] 
0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23
>>> [2021-03-10 19:18:47.281499] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615403927}]
>>> [2021-03-10 19:18:47.281551] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1258258}]
>>> [2021-03-10 19:18:47.357244] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1258257}]
>>>
>>> Any ideas on where to debug this? I'd prefer not to have to remove and re-sync everything as there is about 240TB on the cluster...
>>>
>>> Thanks,
>>> -Matthew
>>>
>>> ________
>>>
>>> Community Meeting Calendar:
>>>
>>> Schedule -
>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>> Bridge: https://meet.google.com/cpu-eiue-hvk
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
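
For anyone comparing the gfchangelog lines quoted above, here is a minimal sketch (not part of Gluster) that pulls the {key=value} fields out of those log lines and shows the epoch values as UTC times. It assumes Python 3 and that the fields always have the `[{key=value}, ...]` shape seen in this thread; the helper names parse_fields, to_utc and summarize are hypothetical and chosen only for illustration.

```python
import re
from datetime import datetime, timezone

# Matches the "{key=value}" fields appended to gsyncd/gfchangelog messages.
# Assumption: values never contain a closing brace, as in the logs quoted here.
FIELD_RE = re.compile(r"\{(\w+)=([^}]*)\}")


def parse_fields(line):
    """Return the {key=value} fields of one log line as a dict of strings."""
    return dict(FIELD_RE.findall(line))


def to_utc(epoch):
    """Format a Unix timestamp (int or numeric string) as a UTC date-time."""
    moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def summarize(request_line, minmax_line):
    """Compare the requested history range with the available changelog range."""
    req = parse_fields(request_line)
    avail = parse_fields(minmax_line)
    start, end = int(req["start"]), int(req["end"])
    low, high = int(avail["min"]), int(avail["max"])
    return {
        "requested": (to_utc(start), to_utc(end)),
        "available": (to_utc(low), to_utc(high)),
        # True when the requested window lies inside the available changelogs.
        "within_range": low <= start <= end <= high,
        # True when the request ends exactly at the newest changelog.
        "end_equals_max": end == high,
    }


# Lines copied from changes-data-storage_a-storage.log in the original report.
REQUEST = ("0-gfchangelog: Requesting historical changelogs "
           "[{start=1614666553}, {end=1615403927}]")
MINMAX = ("0-gfchangelog: changelogs min max "
          "[{min=1597342860}, {max=1615403927}, {total_changelogs=1250878}]")

if __name__ == "__main__":
    for key, value in summarize(REQUEST, MINMAX).items():
        print(key, value)
```

This only restates what the logs already contain: the requested range can be checked against the min/max range, and whether the requested end equals the max. It does not explain why gf_history_changelog reports "wrong result".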