Strahil Nikolov
2021-Mar-17 04:36 UTC
[Gluster-users] GeoRep Faulty after Gluster 7 to 8 upgrade - gfchangelog: wrong result
Have you verified all steps for creating the geo-replication ? If yes , maybe using "reset-sync-time + delete + create" makes sense.Keep in mind that it will take a long time once the geo-rep is established again. Best Regards,Strahil Nikolov On Tue, Mar 16, 2021 at 22:34, Matthew Benstead<matthewb at uvic.ca> wrote: Thanks Strahil, I wanted to make sure the issue wasn't occurring because there were no new changes to sync from the master volume. So I created some files and restarted the sync, but it had no effect. [root at storage01 ~]# cd /storage2/home/test/ [root at storage01 test]# for nums in {1,2,3,4,5,6,7,8,9,0}; do touch $nums.txt; done [root at storage01 test]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup start Starting geo-replication session between storage & geoaccount at 10.0.231.81::pcic-backup has been successful [root at storage01 test]# gluster volume geo-replication status ? MASTER NODE??? MASTER VOL??? MASTER BRICK?????????????? SLAVE USER??? SLAVE??????????????????????????????????????? SLAVE NODE??? STATUS???????????? CRAWL STATUS??? LAST_SYNCED????????? ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 10.0.231.91??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.91??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.91??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? [root at storage01 test]# gluster volume geo-replication status ? MASTER NODE??? MASTER VOL??? MASTER BRICK?????????????? SLAVE USER??? SLAVE??????????????????????????????????????? SLAVE NODE??? STATUS??? CRAWL STATUS??? LAST_SYNCED????????? --------------------------------------------------------------------------------------------------------------------------------------------------------------------- 10.0.231.91??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.91??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.91??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? [root at storage01 test]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup stop Stopping geo-replication session between storage & geoaccount at 10.0.231.81::pcic-backup has been successful Still getting the same error about the history crawl failing: [2021-03-16 19:05:05.227677] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615921505}] [2021-03-16 19:05:05.227733] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615921502}, {total_changelogs=1300114}] [2021-03-16 19:05:05.408567] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615921502}, {idx=1300113}] [2021-03-16 19:05:05.228092] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615921505}] [2021-03-16 19:05:05.228626] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 124117:140500837320448:1615921505.23 keep_alive(None,) ... [2021-03-16 19:05:05.230076] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 124117:140500837320448:1615921505.23 keep_alive -> 1 [2021-03-16 19:05:05.230693] D [master(worker /data/storage_c/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ... [2021-03-16 19:05:05.237607] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-16 19:05:05.242046] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-16 19:05:05.242450] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615921505}] [2021-03-16 19:05:05.244151] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-16 19:05:05.394129] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-16 19:05:05.408759] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-16 19:05:06.158694] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}] [2021-03-16 19:05:06.163052] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-16 19:05:06.204464] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}] [2021-03-16 19:05:06.208961] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-16 19:05:06.220495] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}] [2021-03-16 19:05:06.223947] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] I confirmed NTP is working: pcic-backup02 | CHANGED | rc=0 >> ???? remote?????????? refid????? st t when poll reach?? delay?? offset? jitter ============================================================================= +s216-232-132-95 68.69.221.61???? 2 u?? 29 1024? 377?? 24.141??? 2.457?? 1.081 *yyz-1.ip.0xt.ca 206.108.0.131??? 2 u? 257 1024? 377?? 57.119?? -0.084?? 5.625 +ip102.ip-198-27 192.168.10.254?? 2 u? 189 1024? 377?? 64.227?? -3.012?? 8.867 storage03 | CHANGED | rc=0 >> ???? remote?????????? refid????? st t when poll reach?? delay?? offset? jitter ============================================================================= *198.161.203.36? 128.233.150.93?? 2 u?? 36 1024? 377?? 16.055?? -0.381?? 0.318 +s206-75-147-25. 192.168.10.254?? 2 u? 528 1024? 377?? 23.648?? -6.196?? 4.803 +time.cloudflare 10.69.8.80?????? 3 u? 121 1024? 377??? 2.408??? 0.507?? 0.791 storage02 | CHANGED | rc=0 >> ???? remote?????????? refid????? st t when poll reach?? delay?? offset? jitter ============================================================================= *198.161.203.36? 128.233.150.93?? 2 u? 918 1024? 377?? 15.952??? 0.226?? 0.197 +linuxgeneration 16.164.40.197??? 2 u?? 88 1024? 377?? 62.692?? -1.160?? 2.007 +dns3.switch.ca? 206.108.0.131??? 2 u? 857 1024? 377?? 27.315??? 0.778?? 0.483 storage01 | CHANGED | rc=0 >> ???? remote?????????? refid????? st t when poll reach?? delay?? offset? jitter ============================================================================= +198.161.203.36? 128.233.150.93?? 2 u? 121 1024? 377?? 16.069??? 1.016?? 0.195 +zero.gotroot.ca 30.114.5.31????? 2 u? 543 1024? 377??? 5.106?? -2.462?? 4.923 *ntp3.torix.ca?? .PTP0.?????????? 1 u? 300 1024? 377?? 54.010??? 2.421? 15.182 pcic-backup01 | CHANGED | rc=0 >> ???? remote?????????? refid????? st t when poll reach?? delay?? offset? jitter ============================================================================= *dns3.switch.ca? 206.108.0.131??? 2 u? 983 1024? 377?? 26.990??? 0.523?? 1.389 +dns2.switch.ca? 206.108.0.131??? 2 u? 689 1024? 377?? 26.975?? -0.257?? 0.467 +64.ip-54-39-23. 214.176.184.39?? 2 u? 909 1024? 377?? 64.262?? -0.604?? 6.129 And everything is working on the same version of gluster: pcic-backup02 | CHANGED | rc=0 >> glusterfs 8.3 pcic-backup01 | CHANGED | rc=0 >> glusterfs 8.3 storage02 | CHANGED | rc=0 >> glusterfs 8.3 storage01 | CHANGED | rc=0 >> glusterfs 8.3 storage03 | CHANGED | rc=0 >> glusterfs 8.3 SSH works, and the backup user/group is configured with mountbroker: [root at storage01 ~]# ssh -i /root/.ssh/id_rsa geoaccount at 10.0.231.81 uname -a Linux pcic-backup01 3.10.0-1160.15.2.el7.x86_64 #1 SMP Wed Feb 3 15:06:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux [root at storage01 ~]# ssh -i /root/.ssh/id_rsa geoaccount at 10.0.231.82 uname -a Linux pcic-backup02 3.10.0-1160.15.2.el7.x86_64 #1 SMP Wed Feb 3 15:06:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux [root at pcic-backup01 ~]# grep geo /etc/passwd geoaccount:x:1000:1000::/home/geoaccount:/bin/bash [root at pcic-backup01 ~]# grep geo /etc/group geogroup:x:1000:geoaccount geoaccount:x:1001:geoaccount [root at pcic-backup01 ~]# gluster-mountbroker status +-------------+-------------+---------------------------+--------------+--------------------------+ |???? NODE??? | NODE STATUS |???????? MOUNT ROOT??????? |??? GROUP???? |????????? USERS?????????? | +-------------+-------------+---------------------------+--------------+--------------------------+ | 10.0.231.82 |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)? | |? localhost? |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)? | +-------------+-------------+---------------------------+--------------+--------------------------+ So, then if I'm going to have to resync, what is the best way to do this? With delete or delete reset-sync-time ???https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/sect-starting_geo-replication#Deleting_a_Geo-replication_Session Erasing the index? So I don't have to transfer the files again that are already on the backup? - https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/administration_guide/sect-troubleshooting_geo-replication#Synchronization_Is_Not_Complete - https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Geo%20Replication/#best-practices Is it possible to use the special-sync-mode? option from here:https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/sect-disaster_recovery Thoughts? Thanks, ?-Matthew -- On 3/12/21 3:31 PM, Strahil Nikolov wrote: Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information. Usually, when I'm stuck - I just start over. For example, check the prerequisites: - Is ssh available (no firewall blocking) - Is time sync enabled (ntp/chrony) - Is DNS ok on all hosts (including PTR records) - Is the gluster version the same on all nodes (primary & secondary) Then start over as if the geo rep was never existing. For example , stop it and start over with the secondary nodes's checks (mountbroker, user, group) . Most probably somwthing will come up and you will fix it. In worst case scenario, you will need to clean ip the geo-rep and start fresh. Best Regards, Strahil Nikolov On Fri, Mar 12, 2021 at 20:01, Matthew Benstead <matthewb at uvic.ca> wrote: Hi Strahil, Yes, SELinux was put into permissive mode on the secondary nodes as well: [root at pcic-backup01 ~]# sestatus | egrep -i? "^SELinux status|mode" SELinux status:???????????????? enabled Current mode:?????????????????? permissive Mode from config file:????????? enforcing [root at pcic-backup02 ~]# sestatus | egrep -i? "^SELinux status|mode" SELinux status:???????????????? enabled Current mode:?????????????????? permissive Mode from config file:????????? enforcing The secondary server logs didn't show anything interesting: gsyncd.log: [2021-03-11 19:15:28.81820] I [resource(slave 10.0.231.92/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:28.101819] I [resource(slave 10.0.231.91/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:28.107012] I [resource(slave 10.0.231.93/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:28.124567] I [resource(slave 10.0.231.93/data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:28.128145] I [resource(slave 10.0.231.93/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:29.425739] I [resource(slave 10.0.231.93/data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3184}] [2021-03-11 19:15:29.427448] I [resource(slave 10.0.231.93/data/storage_c/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:29.433340] I [resource(slave 10.0.231.93/data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3083}] [2021-03-11 19:15:29.434452] I [resource(slave 10.0.231.91/data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3321}] [2021-03-11 19:15:29.434314] I [resource(slave 10.0.231.93/data/storage_b/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:29.435575] I [resource(slave 10.0.231.91/data/storage_a/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:29.439769] I [resource(slave 10.0.231.92/data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3576}] [2021-03-11 19:15:29.440998] I [resource(slave 10.0.231.92/data/storage_c/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:29.454745] I [resource(slave 10.0.231.93/data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3262}] [2021-03-11 19:15:29.456192] I [resource(slave 10.0.231.93/data/storage_a/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:32.594865] I [repce(slave 10.0.231.92/data/storage_c/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:32.607815] I [repce(slave 10.0.231.93/data/storage_c/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:32.647663] I [repce(slave 10.0.231.93/data/storage_b/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:32.656280] I [repce(slave 10.0.231.91/data/storage_a/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:32.668299] I [repce(slave 10.0.231.93/data/storage_a/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:44.260689] I [resource(slave 10.0.231.92/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:44.271457] I [resource(slave 10.0.231.93/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:44.271883] I [resource(slave 10.0.231.93/data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:44.279670] I [resource(slave 10.0.231.91/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:44.284261] I [resource(slave 10.0.231.93/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:45.614280] I [resource(slave 10.0.231.93/data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3419}] [2021-03-11 19:15:45.615622] I [resource(slave 10.0.231.93/data/storage_b/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:45.617986] I [resource(slave 10.0.231.93/data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3461}] [2021-03-11 19:15:45.618180] I [resource(slave 10.0.231.91/data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3380}] [2021-03-11 19:15:45.619539] I [resource(slave 10.0.231.91/data/storage_a/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:45.618999] I [resource(slave 10.0.231.93/data/storage_c/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:45.620843] I [resource(slave 10.0.231.93/data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3361}] [2021-03-11 19:15:45.621347] I [resource(slave 10.0.231.92/data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3604}] [2021-03-11 19:15:45.622179] I [resource(slave 10.0.231.93/data/storage_a/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:45.622541] I [resource(slave 10.0.231.92/data/storage_c/storage):1166:service_loop] GLUSTER: slave listening [2021-03-11 19:15:47.626054] I [repce(slave 10.0.231.91/data/storage_a/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:48.778399] I [repce(slave 10.0.231.93/data/storage_c/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:48.778491] I [repce(slave 10.0.231.92/data/storage_c/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:48.796854] I [repce(slave 10.0.231.93/data/storage_a/storage):96:service_loop] RepceServer: terminating on reaching EOF. [2021-03-11 19:15:48.800697] I [repce(slave 10.0.231.93/data/storage_b/storage):96:service_loop] RepceServer: terminating on reaching EOF. The mnt geo-rep files were also uninteresting: [2021-03-11 19:15:28.250150] I [MSGID: 100030] [glusterfsd.c:2689:main] 0-/usr/sbin/glusterfs: Started running version [{arg=/usr/sbin/glusterfs}, {version=8.3}, {cmdlinestr=/usr/sbin/glusterfs --user-map-root=g eoaccount --aux-gfid-mount --acl --log-level=INFO--log-file=/var/log/glusterfs/geo-replication-slaves/storage_10.0.231.81_pcic-backup/mnt-10.0.231.93-data-storage_b-storage.log --volfile-server=localhost --volf ile-id=pcic-backup --client-pid=-1 /var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI}] [2021-03-11 19:15:28.253485] I [glusterfsd.c:2424:daemonize] 0-glusterfs: Pid of current running process is 157484 [2021-03-11 19:15:28.267911] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}] [2021-03-11 19:15:28.267984] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}] [2021-03-11 19:15:28.268371] I [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: 10.0.231.82:24007 [2021-03-11 19:15:28.271729] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] [2021-03-11 19:15:28.271762] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] [2021-03-11 19:15:28.272223] I [MSGID: 114020] [client.c:2315:notify] 0-pcic-backup-client-0: parent translators are ready, attempting connect on transport [] [2021-03-11 19:15:28.275883] I [MSGID: 114020] [client.c:2315:notify] 0-pcic-backup-client-1: parent translators are ready, attempting connect on transport [] [2021-03-11 19:15:28.276154] I [rpc-clnt.c:1975:rpc_clnt_reconfig] 0-pcic-backup-client-0: changing port to 49153 (from 0) [2021-03-11 19:15:28.276193] I [socket.c:849:__socket_shutdown] 0-pcic-backup-client-0: intentional socket shutdown(13) Final graph: ... +------------------------------------------------------------------------------+ [2021-03-11 19:15:28.282144] I [socket.c:849:__socket_shutdown] 0-pcic-backup-client-1: intentional socket shutdown(15) [2021-03-11 19:15:28.286536] I [MSGID: 114057] [client-handshake.c:1128:select_server_supported_programs] 0-pcic-backup-client-0: Using Program [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}] [2021-03-11 19:15:28.287208] I [MSGID: 114046] [client-handshake.c:857:client_setvolume_cbk] 0-pcic-backup-client-0: Connected, attached to remote volume [{conn-name=pcic-backup-client-0}, {remote_subvol=/data/brick}] [2021-03-11 19:15:28.290162] I [MSGID: 114057] [client-handshake.c:1128:select_server_supported_programs] 0-pcic-backup-client-1: Using Program [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}] [2021-03-11 19:15:28.291122] I [MSGID: 114046] [client-handshake.c:857:client_setvolume_cbk] 0-pcic-backup-client-1: Connected, attached to remote volume [{conn-name=pcic-backup-client-1}, {remote_subvol=/data/brick}] [2021-03-11 19:15:28.292703] I [fuse-bridge.c:5300:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23 [2021-03-11 19:15:28.292730] I [fuse-bridge.c:5926:fuse_graph_sync] 0-fuse: switched to graph 0 [2021-03-11 19:15:32.809518] I [fuse-bridge.c:6242:fuse_thread_proc] 0-fuse: initiating unmount of /var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI [2021-03-11 19:15:32.810216] W [glusterfsd.c:1439:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7ea5) [0x7ff56b175ea5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55664e67db45] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55664e67d9ab] ) 0-: received signum (15), shutting down [2021-03-11 19:15:32.810253] I [fuse-bridge.c:7074:fini] 0-fuse: Unmounting '/var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI'. [2021-03-11 19:15:32.810268] I [fuse-bridge.c:7079:fini] 0-fuse: Closing fuse connection to '/var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI'. I'm really at a loss for where to go from here, it seems like everything is set up correctly, and it has been working well through the 7.x minor versions, but the jump to 8 has broken something... There definitely are lots of changelogs on the servers that fit into the timeframe..... I haven't made any writes to the source volume.... do you think that's the problem? That it needs some new changelog info to sync? I had been holding off making any writes in case I needed to go back to Gluster7.9 - not sure if that's really a good option or not. [root at storage01 changelogs]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | head; echo ""; done /data/storage_a/storage/.glusterfs/changelogs total 16G drw-------. 3 root root?? 24 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG -rw-r--r--. 1 root root? 13K Aug 13? 2020 CHANGELOG.1597343197 -rw-r--r--. 1 root root? 51K Aug 13? 2020 CHANGELOG.1597343212 -rw-r--r--. 1 root root? 86K Aug 13? 2020 CHANGELOG.1597343227 -rw-r--r--. 1 root root? 99K Aug 13? 2020 CHANGELOG.1597343242 -rw-r--r--. 1 root root? 69K Aug 13? 2020 CHANGELOG.1597343257 -rw-r--r--. 1 root root? 69K Aug 13? 2020 CHANGELOG.1597343272 -rw-r--r--. 1 root root? 72K Aug 13? 2020 CHANGELOG.1597343287 /data/storage_b/storage/.glusterfs/changelogs total 3.3G drw-------. 3 root root?? 24 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG -rw-r--r--. 1 root root? 13K Aug 13? 2020 CHANGELOG.1597343197 -rw-r--r--. 1 root root? 53K Aug 13? 2020 CHANGELOG.1597343212 -rw-r--r--. 1 root root? 89K Aug 13? 2020 CHANGELOG.1597343227 -rw-r--r--. 1 root root? 89K Aug 13? 2020 CHANGELOG.1597343242 -rw-r--r--. 1 root root? 69K Aug 13? 2020 CHANGELOG.1597343257 -rw-r--r--. 1 root root? 71K Aug 13? 2020 CHANGELOG.1597343272 -rw-r--r--. 1 root root? 86K Aug 13? 2020 CHANGELOG.1597343287 /data/storage_c/storage/.glusterfs/changelogs total 9.6G drw-------. 3 root root?? 16 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG -rw-r--r--. 1 root root? 16K Aug 13? 2020 CHANGELOG.1597343199 -rw-r--r--. 1 root root? 71K Aug 13? 2020 CHANGELOG.1597343214 -rw-r--r--. 1 root root 122K Aug 13? 2020 CHANGELOG.1597343229 -rw-r--r--. 1 root root? 73K Aug 13? 2020 CHANGELOG.1597343244 -rw-r--r--. 1 root root 100K Aug 13? 2020 CHANGELOG.1597343259 -rw-r--r--. 1 root root? 95K Aug 13? 2020 CHANGELOG.1597343274 -rw-r--r--. 1 root root? 92K Aug 13? 2020 CHANGELOG.1597343289 [root at storage01 changelogs]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | tail; echo ""; done /data/storage_a/storage/.glusterfs/changelogs -rw-r--r--. 1 root root?? 92 Mar? 1 21:33 CHANGELOG.1614663193 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663731 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663760 -rw-r--r--. 1 root root? 511 Mar? 1 21:47 CHANGELOG.1614664043 -rw-r--r--. 1 root root? 536 Mar? 1 21:48 CHANGELOG.1614664101 -rw-r--r--. 1 root root 2.8K Mar? 1 21:48 CHANGELOG.1614664116 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666061 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666554 drw-------. 2 root root?? 10 May? 7? 2020 csnap drw-------. 2 root root?? 38 Aug 13? 2020 htime /data/storage_b/storage/.glusterfs/changelogs -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663731 -rw-r--r--. 1 root root? 480 Mar? 1 21:42 CHANGELOG.1614663745 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663760 -rw-r--r--. 1 root root? 524 Mar? 1 21:47 CHANGELOG.1614664043 -rw-r--r--. 1 root root? 495 Mar? 1 21:48 CHANGELOG.1614664100 -rw-r--r--. 1 root root 1.6K Mar? 1 21:48 CHANGELOG.1614664114 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666060 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666553 drw-------. 2 root root?? 10 May? 7? 2020 csnap drw-------. 2 root root?? 38 Aug 13? 2020 htime /data/storage_c/storage/.glusterfs/changelogs -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663738 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663753 -rw-r--r--. 1 root root? 395 Mar? 1 21:47 CHANGELOG.1614664051 -rw-r--r--. 1 root root? 316 Mar? 1 21:48 CHANGELOG.1614664094 -rw-r--r--. 1 root root 1.2K Mar? 1 21:48 CHANGELOG.1614664109 -rw-r--r--. 1 root root? 174 Mar? 1 21:48 CHANGELOG.1614664123 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666061 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666553 drw-------. 2 root root??? 6 May? 7? 2020 csnap drw-------. 2 root root?? 30 Aug 13? 2020 htime [root at storage02 ~]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | head; echo ""; done /data/storage_a/storage/.glusterfs/changelogs total 9.6G drw-------. 3 root root?? 24 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG -rw-r--r--. 1 root root 4.2K Aug 13? 2020 CHANGELOG.1597343193 -rw-r--r--. 1 root root? 32K Aug 13? 2020 CHANGELOG.1597343208 -rw-r--r--. 1 root root 107K Aug 13? 2020 CHANGELOG.1597343223 -rw-r--r--. 1 root root 120K Aug 13? 2020 CHANGELOG.1597343238 -rw-r--r--. 1 root root? 72K Aug 13? 2020 CHANGELOG.1597343253 -rw-r--r--. 1 root root 111K Aug 13? 2020 CHANGELOG.1597343268 -rw-r--r--. 1 root root? 91K Aug 13? 2020 CHANGELOG.1597343283 /data/storage_b/storage/.glusterfs/changelogs total 16G drw-------. 3 root root?? 24 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG -rw-r--r--. 1 root root 3.9K Aug 13? 2020 CHANGELOG.1597343193 -rw-r--r--. 1 root root? 35K Aug 13? 2020 CHANGELOG.1597343208 -rw-r--r--. 1 root root? 85K Aug 13? 2020 CHANGELOG.1597343223 -rw-r--r--. 1 root root 103K Aug 13? 2020 CHANGELOG.1597343238 -rw-r--r--. 1 root root? 70K Aug 13? 2020 CHANGELOG.1597343253 -rw-r--r--. 1 root root? 72K Aug 13? 2020 CHANGELOG.1597343268 -rw-r--r--. 1 root root? 73K Aug 13? 2020 CHANGELOG.1597343283 /data/storage_c/storage/.glusterfs/changelogs total 3.3G drw-------. 3 root root?? 16 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:51 CHANGELOG -rw-r--r--. 1 root root? 21K Aug 13? 2020 CHANGELOG.1597343202 -rw-r--r--. 1 root root? 75K Aug 13? 2020 CHANGELOG.1597343217 -rw-r--r--. 1 root root? 92K Aug 13? 2020 CHANGELOG.1597343232 -rw-r--r--. 1 root root? 77K Aug 13? 2020 CHANGELOG.1597343247 -rw-r--r--. 1 root root? 66K Aug 13? 2020 CHANGELOG.1597343262 -rw-r--r--. 1 root root? 84K Aug 13? 2020 CHANGELOG.1597343277 -rw-r--r--. 1 root root? 81K Aug 13? 2020 CHANGELOG.1597343292 [root at storage02 ~]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | tail; echo ""; done /data/storage_a/storage/.glusterfs/changelogs -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663734 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663749 -rw-r--r--. 1 root root? 395 Mar? 1 21:47 CHANGELOG.1614664052 -rw-r--r--. 1 root root? 316 Mar? 1 21:48 CHANGELOG.1614664096 -rw-r--r--. 1 root root 1.2K Mar? 1 21:48 CHANGELOG.1614664111 -rw-r--r--. 1 root root? 174 Mar? 1 21:48 CHANGELOG.1614664126 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666056 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666560 drw-------. 2 root root?? 10 May? 7? 2020 csnap drw-------. 2 root root?? 38 Aug 13? 2020 htime /data/storage_b/storage/.glusterfs/changelogs -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663735 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663749 -rw-r--r--. 1 root root? 511 Mar? 1 21:47 CHANGELOG.1614664052 -rw-r--r--. 1 root root? 316 Mar? 1 21:48 CHANGELOG.1614664096 -rw-r--r--. 1 root root 1.8K Mar? 1 21:48 CHANGELOG.1614664111 -rw-r--r--. 1 root root 1.4K Mar? 1 21:48 CHANGELOG.1614664126 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666060 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666556 drw-------. 2 root root?? 10 May? 7? 2020 csnap drw-------. 2 root root?? 38 Aug 13? 2020 htime /data/storage_c/storage/.glusterfs/changelogs -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663738 -rw-r--r--. 1 root root? 521 Mar? 1 21:42 CHANGELOG.1614663752 -rw-r--r--. 1 root root? 524 Mar? 1 21:47 CHANGELOG.1614664042 -rw-r--r--. 1 root root?? 92 Mar? 1 21:47 CHANGELOG.1614664057 -rw-r--r--. 1 root root? 536 Mar? 1 21:48 CHANGELOG.1614664102 -rw-r--r--. 1 root root 1.6K Mar? 1 21:48 CHANGELOG.1614664117 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666057 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666550 drw-------. 2 root root??? 6 May? 7? 2020 csnap drw-------. 2 root root?? 30 Aug 13? 2020 htime [root at storage03 ~]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | head; echo ""; done /data/storage_a/storage/.glusterfs/changelogs total 3.4G drw-------. 3 root root?? 24 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG -rw-r--r--. 1 root root? 19K Aug 13? 2020 CHANGELOG.1597343201 -rw-r--r--. 1 root root? 66K Aug 13? 2020 CHANGELOG.1597343215 -rw-r--r--. 1 root root? 91K Aug 13? 2020 CHANGELOG.1597343230 -rw-r--r--. 1 root root? 82K Aug 13? 2020 CHANGELOG.1597343245 -rw-r--r--. 1 root root? 64K Aug 13? 2020 CHANGELOG.1597343259 -rw-r--r--. 1 root root? 75K Aug 13? 2020 CHANGELOG.1597343274 -rw-r--r--. 1 root root? 81K Aug 13? 2020 CHANGELOG.1597343289 /data/storage_b/storage/.glusterfs/changelogs total 9.6G drw-------. 3 root root?? 24 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:51 CHANGELOG -rw-r--r--. 1 root root? 19K Aug 13? 2020 CHANGELOG.1597343201 -rw-r--r--. 1 root root? 80K Aug 13? 2020 CHANGELOG.1597343215 -rw-r--r--. 1 root root 119K Aug 13? 2020 CHANGELOG.1597343230 -rw-r--r--. 1 root root? 65K Aug 13? 2020 CHANGELOG.1597343244 -rw-r--r--. 1 root root 100K Aug 13? 2020 CHANGELOG.1597343259 -rw-r--r--. 1 root root? 95K Aug 13? 2020 CHANGELOG.1597343274 -rw-r--r--. 1 root root? 92K Aug 13? 2020 CHANGELOG.1597343289 /data/storage_c/storage/.glusterfs/changelogs total 16G drw-------. 3 root root?? 16 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 12 09:51 CHANGELOG -rw-r--r--. 1 root root 3.9K Aug 13? 2020 CHANGELOG.1597343193 -rw-r--r--. 1 root root? 35K Aug 13? 2020 CHANGELOG.1597343208 -rw-r--r--. 1 root root? 85K Aug 13? 2020 CHANGELOG.1597343223 -rw-r--r--. 1 root root 103K Aug 13? 2020 CHANGELOG.1597343238 -rw-r--r--. 1 root root? 70K Aug 13? 2020 CHANGELOG.1597343253 -rw-r--r--. 1 root root? 71K Aug 13? 2020 CHANGELOG.1597343268 -rw-r--r--. 1 root root? 73K Aug 13? 2020 CHANGELOG.1597343283 [root at storage03 ~]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | tail; echo ""; done /data/storage_a/storage/.glusterfs/changelogs -rw-r--r--. 1 root root?? 92 Mar? 1 21:33 CHANGELOG.1614663183 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663740 -rw-r--r--. 1 root root? 521 Mar? 1 21:42 CHANGELOG.1614663755 -rw-r--r--. 1 root root? 524 Mar? 1 21:47 CHANGELOG.1614664049 -rw-r--r--. 1 root root 1.9K Mar? 1 21:48 CHANGELOG.1614664106 -rw-r--r--. 1 root root? 174 Mar? 1 21:48 CHANGELOG.1614664121 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666051 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666559 drw-------. 2 root root?? 10 May? 7? 2020 csnap drw-------. 2 root root?? 38 Aug 13? 2020 htime /data/storage_b/storage/.glusterfs/changelogs -rw-r--r--. 1 root root? 474 Mar? 1 21:33 CHANGELOG.1614663182 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663739 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663753 -rw-r--r--. 1 root root? 395 Mar? 1 21:47 CHANGELOG.1614664049 -rw-r--r--. 1 root root 1.4K Mar? 1 21:48 CHANGELOG.1614664106 -rw-r--r--. 1 root root? 174 Mar? 1 21:48 CHANGELOG.1614664120 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666063 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666557 drw-------. 2 root root?? 10 May? 7? 2020 csnap drw-------. 2 root root?? 38 Aug 13? 2020 htime /data/storage_c/storage/.glusterfs/changelogs -rw-r--r--. 1 root root? 468 Mar? 1 21:33 CHANGELOG.1614663183 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663740 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663754 -rw-r--r--. 1 root root? 511 Mar? 1 21:47 CHANGELOG.1614664048 -rw-r--r--. 1 root root 2.0K Mar? 1 21:48 CHANGELOG.1614664105 -rw-r--r--. 1 root root 1.4K Mar? 1 21:48 CHANGELOG.1614664120 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666063 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666556 drw-------. 2 root root??? 6 May? 7? 2020 csnap drw-------. 2 root root?? 30 Aug 13? 2020 htime Thanks, ?-Matthew -- Matthew Benstead System Administrator Pacific Climate Impacts Consortium University of Victoria, UH1 PO Box 1800, STN CSC Victoria, BC, V8W 2Y2 Phone: +1-250-721-8432 Email: matthewb at uvic.ca On 3/11/21 11:37 PM, Strahil Nikolov wrote: Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information. Have you checked the secondary volume nodes' logs & SELINUX status ? Best Regards, Strahil Nikolov On Thu, Mar 11, 2021 at 21:36, Matthew Benstead <matthewb at uvic.ca> wrote: Hi Strahil, It looks like perhaps the changelog_log_level and log_level options? I've set them to debug: [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup config | egrep -i "log_level" changelog_log_level:INFO cli_log_level:INFO gluster_log_level:INFO log_level:INFO slave_gluster_log_level:INFO slave_log_level:INFO [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup config changelog_log_level DEBUG geo-replication config updated successfully [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup config log_level DEBUG geo-replication config updated successfully Then I restarted geo-replication: [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup start Starting geo-replication session between storage & geoaccount at 10.0.231.81::pcic-backup has been successful [root at storage01 ~]# gluster volume geo-replication status ? MASTER NODE??? MASTER VOL??? MASTER BRICK?????????????? SLAVE USER??? SLAVE??????????????????????????????????????? SLAVE NODE??? STATUS???????????? CRAWL STATUS??? LAST_SYNCED????????? ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 10.0.231.91??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.91??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.91??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Initializing...??? N/A???????????? N/A????????????????? [root at storage01 ~]# gluster volume geo-replication status ? MASTER NODE??? MASTER VOL??? MASTER BRICK?????????????? SLAVE USER??? SLAVE??????????????????????????????????????? SLAVE NODE??? STATUS??? CRAWL STATUS??? LAST_SYNCED????????? --------------------------------------------------------------------------------------------------------------------------------------------------------------------- 10.0.231.91??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.91??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.91??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.92??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_c/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_b/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A????????????????? 10.0.231.93??? storage?????? /data/storage_a/storage??? geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup??? N/A?????????? Faulty??? N/A???????????? N/A? [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup stop Stopping geo-replication session between storage & geoaccount at 10.0.231.81::pcic-backup has been successful The changelogs didn't really show anything new around changelog selection: [root at storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_a-storage.log | egrep "2021-03-11" [2021-03-11 19:15:30.552889] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}] [2021-03-11 19:15:30.552893] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}] [2021-03-11 19:15:30.552894] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}] [2021-03-11 19:15:30.553633] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] [2021-03-11 19:15:30.553634] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] [2021-03-11 19:15:30.554236] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited. [2021-03-11 19:15:30.554403] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0 [2021-03-11 19:15:30.554420] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:30.554933] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay [2021-03-11 19:15:30.554944] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:30.554949] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9 [2021-03-11 19:15:30.555002] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23 [2021-03-11 19:15:30.555324] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0 [2021-03-11 19:15:30.555345] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins [2021-03-11 19:15:30.555351] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout [2021-03-11 19:15:30.555358] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:30.555399] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay [2021-03-11 19:15:30.555406] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:32.555711] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning [2021-03-11 19:15:32.572157] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615490132}] [2021-03-11 19:15:32.572436] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615490121}, {total_changelogs=1256897}] [2021-03-11 19:15:32.621244] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615490121}, {idx=1256896}] [2021-03-11 19:15:46.733182] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}] [2021-03-11 19:15:46.733316] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}] [2021-03-11 19:15:46.733348] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}] [2021-03-11 19:15:46.734031] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] [2021-03-11 19:15:46.734085] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] [2021-03-11 19:15:46.734591] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited. [2021-03-11 19:15:46.734755] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0 [2021-03-11 19:15:46.734772] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:46.735256] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay [2021-03-11 19:15:46.735266] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:46.735271] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9 [2021-03-11 19:15:46.735325] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 21 [2021-03-11 19:15:46.735704] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0 [2021-03-11 19:15:46.735721] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins [2021-03-11 19:15:46.735726] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout [2021-03-11 19:15:46.735733] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:46.735771] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay [2021-03-11 19:15:46.735778] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:47.618464] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning [root at storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_b-storage.log | egrep "2021-03-11" [2021-03-11 19:15:30.611457] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}] [2021-03-11 19:15:30.611574] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}] [2021-03-11 19:15:30.611641] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] [2021-03-11 19:15:30.611645] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] [2021-03-11 19:15:30.612325] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited. [2021-03-11 19:15:30.612488] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0 [2021-03-11 19:15:30.612507] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:30.613005] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay [2021-03-11 19:15:30.613130] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:30.613142] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9 [2021-03-11 19:15:30.613208] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 22 [2021-03-11 19:15:30.613545] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0 [2021-03-11 19:15:30.613567] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins [2021-03-11 19:15:30.613574] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout [2021-03-11 19:15:30.613582] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:30.613637] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay [2021-03-11 19:15:30.613654] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:32.614273] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning [2021-03-11 19:15:32.643628] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615490132}] [2021-03-11 19:15:32.643716] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615490123}, {total_changelogs=1264296}] [2021-03-11 19:15:32.700397] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615490123}, {idx=1264295}] [2021-03-11 19:15:46.832322] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}] [2021-03-11 19:15:46.832394] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}] [2021-03-11 19:15:46.832465] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}] [2021-03-11 19:15:46.832531] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] [2021-03-11 19:15:46.833086] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] [2021-03-11 19:15:46.833648] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited. [2021-03-11 19:15:46.833817] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0 [2021-03-11 19:15:46.833835] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:46.834368] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay [2021-03-11 19:15:46.834380] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:46.834386] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9 [2021-03-11 19:15:46.834441] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23 [2021-03-11 19:15:46.834768] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0 [2021-03-11 19:15:46.834789] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins [2021-03-11 19:15:46.834795] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout [2021-03-11 19:15:46.834802] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so [2021-03-11 19:15:46.834845] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay [2021-03-11 19:15:46.834853] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42 [2021-03-11 19:15:47.618476] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning gsyncd logged a lot but I'm not sure if it's helpful: [2021-03-11 19:15:00.41898] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:21.551302] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:21.631470] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:21.718386] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:21.804991] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.203999] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.284775] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.573355] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.653752] D [gsyncd(monitor):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:26.756994] D [monitor(monitor):304:distribute] <top>: master bricks: [{'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_a/storage'}, {'host': '10.0.2 31.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_b/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_c/storage'}, {'host': '10. 0.231.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_a/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_b/storage'}, {'host': ' 10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_c/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_a/storage'}, {'host' : '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_b/storage'}, {'host': '10.0.231.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_c/storage'}] [2021-03-11 19:15:26.757252] D [monitor(monitor):314:distribute] <top>: slave SSH gateway: geoaccount at 10.0.231.81 [2021-03-11 19:15:27.416235] D [monitor(monitor):334:distribute] <top>: slave bricks: [{'host': '10.0.231.81', 'uuid': 'b88dea4f-31ec-416a-9110-3ccdc3910acd', 'dir': '/data/brick'}, {'host': '10.0.231.82', 'uuid ': 'be50a8de-3934-4fee-a80d-8e2e99017902', 'dir': '/data/brick'}] [2021-03-11 19:15:27.416825] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_a/storage' [2021-03-11 19:15:27.417273] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_c/storage' [2021-03-11 19:15:27.417515] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_b/storage' [2021-03-11 19:15:27.417763] D [monitor(monitor):348:distribute] <top>: worker specs: [({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_a/storage'}, ('geoaccount at 10. 0.231.81', 'b88dea4f-31ec-416a-9110-3ccdc3910acd'), '1', False), ({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_c/storage'}, ('geoaccount at 10.0.231.82', 'be50a8de-3 934-4fee-a80d-8e2e99017902'), '2', False), ({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_b/storage'}, ('geoaccount at 10.0.231.82', 'be50a8de-3934-4fee-a80d-8e2e9901 7902'), '3', False)] [2021-03-11 19:15:27.425009] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] [2021-03-11 19:15:27.426764] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] [2021-03-11 19:15:27.429208] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] [2021-03-11 19:15:27.432280] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:27.434195] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:27.436584] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:27.478806] D [gsyncd(worker /data/storage_c/storage):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:27.478852] D [gsyncd(worker /data/storage_b/storage):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:27.480104] D [gsyncd(worker /data/storage_a/storage):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:27.500456] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:27.501375] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:27.502003] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:27.525511] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490127.53 __repce_version__() ... [2021-03-11 19:15:27.525582] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490127.53 __repce_version__() ... [2021-03-11 19:15:27.526089] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490127.53 __repce_version__() ... [2021-03-11 19:15:29.435985] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490127.53 __repce_version__ -> 1.0 [2021-03-11 19:15:29.436213] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490129.44 version() ... [2021-03-11 19:15:29.437136] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490129.44 version -> 1.0 [2021-03-11 19:15:29.437268] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490129.44 pid() ... [2021-03-11 19:15:29.437915] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490129.44 pid -> 157321 [2021-03-11 19:15:29.438004] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9359}] [2021-03-11 19:15:29.438072] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:29.494538] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490127.53 __repce_version__ -> 1.0 [2021-03-11 19:15:29.494748] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490129.49 version() ... [2021-03-11 19:15:29.495290] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490129.49 version -> 1.0 [2021-03-11 19:15:29.495400] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490129.5 pid() ... [2021-03-11 19:15:29.495872] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490129.5 pid -> 88110 [2021-03-11 19:15:29.495960] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9944}] [2021-03-11 19:15:29.496028] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:29.501255] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490127.53 __repce_version__ -> 1.0 [2021-03-11 19:15:29.501454] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490129.5 version() ... [2021-03-11 19:15:29.502258] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490129.5 version -> 1.0 [2021-03-11 19:15:29.502444] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490129.5 pid() ... [2021-03-11 19:15:29.503140] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490129.5 pid -> 88111 [2021-03-11 19:15:29.503232] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=2.0026}] [2021-03-11 19:15:29.503302] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:29.533899] D [resource(worker /data/storage_a/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:29.595736] D [resource(worker /data/storage_b/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:29.601110] D [resource(worker /data/storage_c/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:30.541542] D [resource(worker /data/storage_a/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:30.541816] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1037}] [2021-03-11 19:15:30.541887] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:30.542042] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:30.542125] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_a/storage) connected [2021-03-11 19:15:30.543323] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:30.544460] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:30.552103] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:30.602937] D [resource(worker /data/storage_b/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:30.603117] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1070}] [2021-03-11 19:15:30.603197] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:30.603353] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:30.603338] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_b/storage) connected [2021-03-11 19:15:30.604620] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:30.605600] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:30.608365] D [resource(worker /data/storage_c/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:30.608534] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1052}] [2021-03-11 19:15:30.608612] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:30.608762] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:30.608779] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_c/storage) connected [2021-03-11 19:15:30.610033] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:30.610637] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:30.610970] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:30.616197] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:31.371265] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:31.451000] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:31.537257] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:31.623800] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:32.555840] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:32.556051] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:32.556122] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:32.556179] I [master(worker /data/storage_a/storage):1645:register] _GMaster: Working dir[{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}] [2021-03-11 19:15:32.556359] I [resource(worker /data/storage_a/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}] [2021-03-11 19:15:32.556823] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140570487928576:1615490132.56 keep_alive(None,) ... [2021-03-11 19:15:32.558429] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140570487928576:1615490132.56 keep_alive -> 1 [2021-03-11 19:15:32.558974] D [master(worker /data/storage_a/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ... [2021-03-11 19:15:32.567478] I [gsyncdstatus(worker /data/storage_a/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-11 19:15:32.571824] I [gsyncdstatus(worker /data/storage_a/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-11 19:15:32.572052] I [master(worker /data/storage_a/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666553, 0)}, {entry_stime=(1614664115, 0)}, {etime=1615490132}] [2021-03-11 19:15:32.614506] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:32.614701] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:32.614788] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:32.614845] I [master(worker /data/storage_b/storage):1645:register] _GMaster: Working dir[{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}] [2021-03-11 19:15:32.615000] I [resource(worker /data/storage_b/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}] [2021-03-11 19:15:32.615586] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139889215526656:1615490132.62 keep_alive(None,) ... [2021-03-11 19:15:32.617373] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139889215526656:1615490132.62 keep_alive -> 1 [2021-03-11 19:15:32.618144] D [master(worker /data/storage_b/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ... [2021-03-11 19:15:32.619323] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:32.619491] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:32.619739] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:32.619863] I [master(worker /data/storage_c/storage):1645:register] _GMaster: Working dir[{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}] [2021-03-11 19:15:32.620040] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}] [2021-03-11 19:15:32.620599] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140386886469376:1615490132.62 keep_alive(None,) ... [2021-03-11 19:15:32.621397] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-11 19:15:32.622035] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140386886469376:1615490132.62 keep_alive -> 1 [2021-03-11 19:15:32.622701] D [master(worker /data/storage_c/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ... [2021-03-11 19:15:32.627031] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-11 19:15:32.643184] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-11 19:15:32.643528] I [master(worker /data/storage_b/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664113, 0)}, {etime=1615490132}] [2021-03-11 19:15:32.645148] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-11 19:15:32.649631] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-11 19:15:32.649882] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615490132}] [2021-03-11 19:15:32.650907] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-11 19:15:32.700489] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-11 19:15:33.545886] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}] [2021-03-11 19:15:33.550487] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:33.606991] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}] [2021-03-11 19:15:33.611573] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:33.612337] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}] [2021-03-11 19:15:33.615777] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:34.684247] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:34.764971] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:34.851174] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:34.937166] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:36.994502] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:37.73805] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:37.159288] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:37.244153] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:38.916510] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:38.997649] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:39.84816] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:39.172045] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:40.896359] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:40.976135] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:41.62052] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:41.147902] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:42.791997] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:42.871239] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:42.956609] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.42473] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.566190] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-11 19:15:43.566400] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] [2021-03-11 19:15:43.572240] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:43.612744] D [gsyncd(worker /data/storage_a/storage):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.625689] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-11 19:15:43.626060] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] [2021-03-11 19:15:43.632287] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-11 19:15:43.632137] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:43.632508] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] [2021-03-11 19:15:43.635565] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:43.637835] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately [2021-03-11 19:15:43.661304] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490143.66 __repce_version__() ... [2021-03-11 19:15:43.674499] D [gsyncd(worker /data/storage_b/storage):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.680706] D [gsyncd(worker /data/storage_c/storage):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:43.693773] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:43.700957] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-11 19:15:43.717686] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490143.72 __repce_version__() ... [2021-03-11 19:15:43.725369] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490143.73 __repce_version__() ... [2021-03-11 19:15:44.289117] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:44.375693] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:44.472251] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:44.558429] D [gsyncd(status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:45.619694] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490143.66 __repce_version__ -> 1.0 [2021-03-11 19:15:45.619930] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490145.62 version() ... [2021-03-11 19:15:45.621191] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490145.62 version -> 1.0 [2021-03-11 19:15:45.621332] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490145.62 pid() ... [2021-03-11 19:15:45.621859] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490145.62 pid -> 158229 [2021-03-11 19:15:45.621939] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9862}] [2021-03-11 19:15:45.622000] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:45.714468] D [resource(worker /data/storage_a/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:45.718441] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490143.73 __repce_version__ -> 1.0 [2021-03-11 19:15:45.718643] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490145.72 version() ... [2021-03-11 19:15:45.719492] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490145.72 version -> 1.0 [2021-03-11 19:15:45.719772] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490145.72 pid() ... [2021-03-11 19:15:45.720202] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490143.72 __repce_version__ -> 1.0 [2021-03-11 19:15:45.720381] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490145.72 version() ... [2021-03-11 19:15:45.720463] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490145.72 pid -> 88921 [2021-03-11 19:15:45.720694] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=2.0196}] [2021-03-11 19:15:45.720882] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:45.721146] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490145.72 version -> 1.0 [2021-03-11 19:15:45.721271] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490145.72 pid() ... [2021-03-11 19:15:45.721795] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490145.72 pid -> 88924 [2021-03-11 19:15:45.721911] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=2.0280}] [2021-03-11 19:15:45.721993] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-11 19:15:45.816891] D [resource(worker /data/storage_b/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:45.816960] D [resource(worker /data/storage_c/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place [2021-03-11 19:15:46.721534] D [resource(worker /data/storage_a/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:46.721726] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.0997}] [2021-03-11 19:15:46.721796] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:46.721971] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:46.722122] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_a/storage) connected [2021-03-11 19:15:46.723871] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:46.725100] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:46.732400] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage [2021-03-11 19:15:46.823477] D [resource(worker /data/storage_c/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:46.823645] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1027}] [2021-03-11 19:15:46.823754] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:46.823932] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:46.823904] D [resource(worker /data/storage_b/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared [2021-03-11 19:15:46.823930] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_c/storage) connected [2021-03-11 19:15:46.824103] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1020}] [2021-03-11 19:15:46.824184] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-11 19:15:46.824340] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}] [2021-03-11 19:15:46.824321] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_b/storage) connected [2021-03-11 19:15:46.825100] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:46.825414] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}] [2021-03-11 19:15:46.826375] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:46.826574] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}] [2021-03-11 19:15:46.831506] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage [2021-03-11 19:15:46.833168] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage [2021-03-11 19:15:47.275141] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:47.320247] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:47.570877] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:47.615571] D [gsyncd(config-get):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:47.620893] E [syncdutils(worker /data/storage_a/storage):325:log_raise_exception] <top>: connection to peer is broken [2021-03-11 19:15:47.620939] E [syncdutils(worker /data/storage_c/storage):325:log_raise_exception] <top>: connection to peer is broken [2021-03-11 19:15:47.621668] E [syncdutils(worker /data/storage_a/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-_AyCOc/79fa3dc75e30f532b4a40bc08c2b10a1.sock geoaccount at 10.0.231.81 /nonexistent/gsyncd slave storage geoaccount at 10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_a/storage --local-node 10.0.231.81 --local-node-id b88dea4f-31ec-416a-9110-3ccdc3910acd --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}] [2021-03-11 19:15:47.621685] E [syncdutils(worker /data/storage_c/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-WOgOEu/e15fc58bb13552de0710eaf018209548.sock geoaccount at 10.0.231.82 /nonexistent/gsyncd slave storage geoaccount at 10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_c/storage --local-node 10.0.231.82 --local-node-id be50a8de-3934-4fee-a80d-8e2e99017902 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}] [2021-03-11 19:15:47.621776] E [syncdutils(worker /data/storage_a/storage):851:logerr] Popen: ssh> Killed by signal 15. [2021-03-11 19:15:47.621819] E [syncdutils(worker /data/storage_c/storage):851:logerr] Popen: ssh> Killed by signal 15. [2021-03-11 19:15:47.621850] E [syncdutils(worker /data/storage_b/storage):325:log_raise_exception] <top>: connection to peer is broken [2021-03-11 19:15:47.622437] E [syncdutils(worker /data/storage_b/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-Vy935W/e15fc58bb13552de0710eaf018209548.sock geoaccount at 10.0.231.82 /nonexistent/gsyncd slave storage geoaccount at 10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_b/storage --local-node 10.0.231.82 --local-node-id be50a8de-3934-4fee-a80d-8e2e99017902 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}] [2021-03-11 19:15:47.622556] E [syncdutils(worker /data/storage_b/storage):851:logerr] Popen: ssh> Killed by signal 15. [2021-03-11 19:15:47.723756] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}] [2021-03-11 19:15:47.731405] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:47.825223] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}] [2021-03-11 19:15:47.825685] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}] [2021-03-11 19:15:47.829011] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:47.830965] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-11 19:15:48.669634] D [gsyncd(monitor-status):303:main] <top>: Using session config file[{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] [2021-03-11 19:15:48.683784] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change [{status=Stopped}] Thanks, ?-Matthew On 3/11/21 9:37 AM, Strahil Nikolov wrote: Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information. I think you have to increase the debug logs for geo-rep session. I will try to find the command necessary to increase it. Best Regards, Strahil Nikolov ? ?????????, 11 ???? 2021 ?., 00:38:41 ?. ???????+2, Matthew Benstead <matthewb at uvic.ca> ??????: Thanks Strahil, Right - I had come across your message in early January that v8 from the CentOS Sig was missing the SELinux rules, and had put SELinux into permissive mode after the upgrade when I saw denied messages in the audit logs. [root at storage01 ~]# sestatus | egrep "^SELinux status|[mM]ode" SELinux status: enabled Current mode: permissive Mode from config file: enforcing Yes - I am using an unprivileged user for georep: [root at pcic-backup01 ~]# gluster-mountbroker status +-------------+-------------+---------------------------+--------------+--------------------------+ |???? NODE | NODE STATUS |???????? MOUNT ROOT |??? GROUP |????????? USERS | +-------------+-------------+---------------------------+--------------+--------------------------+ | 10.0.231.82 |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup) | |? localhost |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup) | +-------------+-------------+---------------------------+--------------+--------------------------+ [root at pcic-backup02 ~]# gluster-mountbroker status +-------------+-------------+---------------------------+--------------+--------------------------+ |???? NODE | NODE STATUS |???????? MOUNT ROOT |??? GROUP |????????? USERS | +-------------+-------------+---------------------------+--------------+--------------------------+ | 10.0.231.81 |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup) | |? localhost |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup) | +-------------+-------------+---------------------------+--------------+--------------------------+ Thanks, -Matthew -- Matthew Benstead System AdministratorPacific Climate Impacts ConsortiumUniversity of Victoria, UH1PO Box 1800, STN CSCVictoria, BC, V8W 2Y2Phone: +1-250-721-8432Email: matthewb at uvic.ca On 3/10/21 2:11 PM, Strahil Nikolov wrote: ?? ??Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information. I have tested georep on v8.3 and it was running quite well untill you involve SELINUX. Are you using SELINUX ? Are you using unprivileged user for the georep ? Also, you can check https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-troubleshooting_geo-replication . Best Regards, Strahil Nikolov ?? ?? On Thu, Mar 11, 2021 at 0:03, Matthew Benstead <matthewb at uvic.ca> wrote: ?? ?? Hello, I recently upgraded my Distributed-Replicate cluster from Gluster 7.9 to 8.3 on CentOS7 using the CentOS Storage SIG packages. I had geo-replication syncing properly before the upgrade, but not it is not working after. After I had upgraded both master and slave clusters I attempted to start geo-replication again, but it goes to faulty quickly: [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup start Starting geo-replication session between storage &??geoaccount at 10.0.231.81::pcic-backup has been successful\ [root at storage01 ~]# gluster volume geo-replication status MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED --------------------------------------------------------------------------------------------------------------------------------------------------------------------- 10.0.231.91 storage /data/storage_a/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A 10.0.231.91 storage /data/storage_c/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A 10.0.231.91 storage /data/storage_b/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A 10.0.231.92 storage /data/storage_b/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A 10.0.231.92 storage /data/storage_a/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A 10.0.231.92 storage /data/storage_c/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A 10.0.231.93 storage /data/storage_c/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A 10.0.231.93 storage /data/storage_b/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A 10.0.231.93 storage /data/storage_a/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup N/A Faulty N/A N/A [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup stop Stopping geo-replication session between storage &??geoaccount at 10.0.231.81::pcic-backup has been successful I went through the gsyncd logs and see it attempts to go back through the changlogs - which would make sense - but fails: [2021-03-10 19:18:42.165807] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-10 19:18:42.166136] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] [2021-03-10 19:18:42.167829] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] [2021-03-10 19:18:42.172343] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-10 19:18:42.172580] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] [2021-03-10 19:18:42.235574] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-10 19:18:42.236613] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-10 19:18:42.238614] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-10 19:18:44.144856] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9059}] [2021-03-10 19:18:44.145065] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-10 19:18:44.162873] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9259}] [2021-03-10 19:18:44.163412] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-10 19:18:44.167506] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9316}] [2021-03-10 19:18:44.167746] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... [2021-03-10 19:18:45.251372] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1062}] [2021-03-10 19:18:45.251583] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-10 19:18:45.271950] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1041}] [2021-03-10 19:18:45.272118] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-10 19:18:45.275180] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1116}] [2021-03-10 19:18:45.275361] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor [2021-03-10 19:18:47.265618] I [master(worker /data/storage_b/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}] [2021-03-10 19:18:47.265954] I [resource(worker /data/storage_b/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] [2021-03-10 19:18:47.276746] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-10 19:18:47.281194] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-10 19:18:47.281404] I [master(worker /data/storage_b/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664113, 0)}, {etime=1615403927}] [2021-03-10 19:18:47.285340] I [master(worker /data/storage_c/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}] [2021-03-10 19:18:47.285579] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] [2021-03-10 19:18:47.287383] I [master(worker /data/storage_a/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}] [2021-03-10 19:18:47.287697] I [resource(worker /data/storage_a/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] [2021-03-10 19:18:47.298415] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-10 19:18:47.301342] I [gsyncdstatus(worker /data/storage_a/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] [2021-03-10 19:18:47.304183] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-10 19:18:47.304418] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615403927}] [2021-03-10 19:18:47.305294] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-10 19:18:47.308124] I [gsyncdstatus(worker /data/storage_a/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] [2021-03-10 19:18:47.308509] I [master(worker /data/storage_a/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666553, 0)}, {entry_stime=(1614664115, 0)}, {etime=1615403927}] [2021-03-10 19:18:47.357470] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-10 19:18:47.383949] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] [2021-03-10 19:18:48.255340] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}] [2021-03-10 19:18:48.260052] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-10 19:18:48.275651] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}] [2021-03-10 19:18:48.278064] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}] [2021-03-10 19:18:48.280453] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-10 19:18:48.282274] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] [2021-03-10 19:18:58.275702] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-10 19:18:58.276041] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] [2021-03-10 19:18:58.296252] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-10 19:18:58.296506] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] [2021-03-10 19:18:58.301290] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] [2021-03-10 19:18:58.301521] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] [2021-03-10 19:18:58.345817] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-10 19:18:58.361268] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-10 19:18:58.367985] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... [2021-03-10 19:18:59.115143] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change [{status=Stopped}] It seems like there is an issue selecting the changelogs - perhaps similar to this issue? https://github.com/gluster/glusterfs/issues/1766 [root at storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_a-storage.log [2021-03-10 19:18:45.284764] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}] [2021-03-10 19:18:45.285275] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] [2021-03-10 19:18:45.285269] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] [2021-03-10 19:18:45.286615] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 21 [2021-03-10 19:18:47.308607] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615403927}] [2021-03-10 19:18:47.308659] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1250878}] [2021-03-10 19:18:47.383774] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1250877}] [root at storage01 storage_10.0.231.81_pcic-backup]# tail -7 changes-data-storage_b-storage.log [2021-03-10 19:18:45.263211] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] [2021-03-10 19:18:45.263151] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}] [2021-03-10 19:18:45.263294] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] [2021-03-10 19:18:45.264598] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23 [2021-03-10 19:18:47.281499] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615403927}] [2021-03-10 19:18:47.281551] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1258258}] [2021-03-10 19:18:47.357244] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1258257}] Any ideas on where to debug this? I'd prefer not to have to remove and re-sync everything as there is about 240TB on the cluster... Thanks, -Matthew ________ Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20210317/ed056612/attachment-0001.html>
Matthew Benstead
2021-Mar-17 21:11 UTC
[Gluster-users] GeoRep Faulty after Gluster 7 to 8 upgrade - gfchangelog: wrong result
Yes, I've run through everything short of regenerating the keys and creating the session again with no errors. Everything looks ok. But I did notice that the changelog format had changed, instead of them being dumped into one directory, they now seem to be separated in year/month/day directories... Looks like this change in 8.0: https://github.com/gluster/glusterfs/issues/154 [root at storage01 changelogs]# ls -lh | head total 16G drw-------. 3 root root?? 24 Mar? 9 11:34 2021 -rw-r--r--. 1 root root?? 51 Mar 17 13:19 CHANGELOG -rw-r--r--. 1 root root? 13K Aug 13? 2020 CHANGELOG.1597343197 -rw-r--r--. 1 root root? 51K Aug 13? 2020 CHANGELOG.1597343212 -rw-r--r--. 1 root root? 86K Aug 13? 2020 CHANGELOG.1597343227 -rw-r--r--. 1 root root? 99K Aug 13? 2020 CHANGELOG.1597343242 -rw-r--r--. 1 root root? 69K Aug 13? 2020 CHANGELOG.1597343257 -rw-r--r--. 1 root root? 69K Aug 13? 2020 CHANGELOG.1597343272 -rw-r--r--. 1 root root? 72K Aug 13? 2020 CHANGELOG.1597343287 [root at storage01 changelogs]# ls -lh | tail -rw-r--r--. 1 root root?? 92 Mar? 1 21:33 CHANGELOG.1614663193 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663731 -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663760 -rw-r--r--. 1 root root? 511 Mar? 1 21:47 CHANGELOG.1614664043 -rw-r--r--. 1 root root? 536 Mar? 1 21:48 CHANGELOG.1614664101 -rw-r--r--. 1 root root 2.8K Mar? 1 21:48 CHANGELOG.1614664116 -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666061 -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666554 drw-------. 2 root root?? 10 May? 7? 2020 csnap drw-------. 2 root root?? 38 Aug 13? 2020 htime [root at storage01 changelogs]# ls -lh 2021/03/09/ total 8.0K -rw-r--r--. 1 root root 51 Mar? 9 11:26 CHANGELOG.1615318474 -rw-r--r--. 1 root root 51 Mar? 9 12:19 CHANGELOG.1615321197 [root at storage01 changelogs]# ls -lh 2021/03/15/ total 4.0K -rw-r--r--. 1 root root 51 Mar 15 13:38 CHANGELOG.1615842490 [root at storage01 changelogs]# ls -lh 2021/03/16 total 4.0K -rw-r--r--. 1 root root 331 Mar 16 12:04 CHANGELOG.1615921482 But it looks like the htime file still records them... [root at storage01 changelogs]# ls -lh htime total 84M -rw-r--r--. 1 root root 84M Mar 17 13:31 HTIME.1597342860 [root at storage01 changelogs]# head -c 256 htime/HTIME.1597342860 /data/storage_a/storage/.glusterfs/changelogs/changelog.1597342875/data/storage_a/storage/.glusterfs/changelogs/changelog.1597342890/data/storage_a/storage/.glusterfs/changelogs/changelog.1597342904/data/storage_a/storage/.glusterfs/changelogs/changelog[root at storage01 changelogs]# [root at storage01 changelogs]# tail -c 256 htime/HTIME.1597342860 /changelog.1616013484/data/storage_a/storage/.glusterfs/changelogs/2021/03/17/changelog.1616013499/data/storage_a/storage/.glusterfs/changelogs/2021/03/17/changelog.1616013514/data/storage_a/storage/.glusterfs/changelogs/2021/03/17/changelog.1616013529[root at storage01 changelogs]# And there seems to be an xattr for time in the brick root - presumably for when changelogs were enabled: [root at storage01 changelogs]# getfattr -d -m. -e hex /data/storage_a/storage 2>&1 | egrep xtime trusted.glusterfs.cf94a8f2-324b-40b3-bf72-c3766100ea99.xtime=0x60510140000ef317 Reading through the changes, it looks like there is a script to convert from one format to the other... I didn't see anything in the release notes for 8.0 about it... this seems like it could be the fix, and explain why gluster can't get through the changelogs....? Thoughts? * https://github.com/gluster/glusterfs/issues/154#issuecomment-585701964 * https://review.gluster.org/#/c/glusterfs/+/24121/ Thanks, ?-Matthew -- Matthew Benstead System Administrator Pacific Climate Impacts Consortium <https://pacificclimate.org/> University of Victoria, UH1 PO Box 1800, STN CSC Victoria, BC, V8W 2Y2 Phone: +1-250-721-8432 Email: matthewb at uvic.ca On 3/16/21 9:36 PM, Strahil Nikolov wrote:> Notice: This message was sent from outside the University of Victoria > email system. Please be cautious with links and sensitive information. > > Have you verified all steps for creating the geo-replication ? > > If yes , maybe using "reset-sync-time + delete + create" makes > sense.Keep in mind that it will take a long time once the geo-rep is > established again. > > > Best Regards, > Strahil Nikolov > > On Tue, Mar 16, 2021 at 22:34, Matthew Benstead > <matthewb at uvic.ca> wrote: > Thanks Strahil, > > I wanted to make sure the issue wasn't occurring because there > were no new changes to sync from the master volume. So I created > some files and restarted the sync, but it had no effect. > > [root at storage01 ~]# cd /storage2/home/test/ > [root at storage01 test]# for nums in {1,2,3,4,5,6,7,8,9,0}; do touch > $nums.txt; done > > [root at storage01 test]# gluster volume geo-replication storage > geoaccount at 10.0.231.81::pcic-backup > <mailto:geoaccount at 10.0.231.81::pcic-backup> start > Starting geo-replication session between storage & > geoaccount at 10.0.231.81::pcic-backup > <mailto:geoaccount at 10.0.231.81::pcic-backup> has been successful > [root at storage01 test]# gluster volume geo-replication status > ? > MASTER NODE??? MASTER VOL??? MASTER BRICK?????????????? SLAVE > USER??? SLAVE??????????????????????????????????????? SLAVE NODE??? > STATUS???????????? CRAWL STATUS??? LAST_SYNCED????????? > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > 10.0.231.91??? storage?????? /data/storage_a/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > 10.0.231.91??? storage?????? /data/storage_c/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > 10.0.231.91??? storage?????? /data/storage_b/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > 10.0.231.93??? storage?????? /data/storage_c/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > 10.0.231.93??? storage?????? /data/storage_b/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > 10.0.231.93??? storage?????? /data/storage_a/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > 10.0.231.92??? storage?????? /data/storage_b/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > 10.0.231.92??? storage?????? /data/storage_a/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > 10.0.231.92??? storage?????? /data/storage_c/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Initializing...??? N/A???????????? N/A????????????????? > [root at storage01 test]# gluster volume geo-replication status > ? > MASTER NODE??? MASTER VOL??? MASTER BRICK?????????????? SLAVE > USER??? SLAVE??????????????????????????????????????? SLAVE NODE??? > STATUS??? CRAWL STATUS??? LAST_SYNCED????????? > --------------------------------------------------------------------------------------------------------------------------------------------------------------------- > 10.0.231.91??? storage?????? /data/storage_a/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > 10.0.231.91??? storage?????? /data/storage_c/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > 10.0.231.91??? storage?????? /data/storage_b/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > 10.0.231.93??? storage?????? /data/storage_c/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > 10.0.231.93??? storage?????? /data/storage_b/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > 10.0.231.93??? storage?????? /data/storage_a/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > 10.0.231.92??? storage?????? /data/storage_b/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > 10.0.231.92??? storage?????? /data/storage_a/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > 10.0.231.92??? storage?????? /data/storage_c/storage??? > geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup > <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? > N/A?????????? Faulty??? N/A???????????? N/A????????????????? > [root at storage01 test]# gluster volume geo-replication storage > geoaccount at 10.0.231.81::pcic-backup > <mailto:geoaccount at 10.0.231.81::pcic-backup> stop > Stopping geo-replication session between storage & > geoaccount at 10.0.231.81::pcic-backup > <mailto:geoaccount at 10.0.231.81::pcic-backup> has been successful > > Still getting the same error about the history crawl failing: > > [2021-03-16 19:05:05.227677] I [MSGID: 132035] > [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: > Requesting historical changelogs [{start=1614666552}, > {end=1615921505}] > [2021-03-16 19:05:05.227733] I [MSGID: 132019] > [gf-history-changelog.c:755:gf_changelog_extract_min_max] > 0-gfchangelog: changelogs min max [{min=1597342860}, > {max=1615921502}, {total_changelogs=1300114}] > [2021-03-16 19:05:05.408567] E [MSGID: 132009] > [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: > wrong result [{for=end}, {start=1615921502}, {idx=1300113}] > > > [2021-03-16 19:05:05.228092] I [resource(worker > /data/storage_c/storage):1292:service_loop] GLUSTER: Register time > [{time=1615921505}] > [2021-03-16 19:05:05.228626] D [repce(worker > /data/storage_c/storage):195:push] RepceClient: call > 124117:140500837320448:1615921505.23 keep_alive(None,) ... > [2021-03-16 19:05:05.230076] D [repce(worker > /data/storage_c/storage):215:__call__] RepceClient: call > 124117:140500837320448:1615921505.23 keep_alive -> 1 > [2021-03-16 19:05:05.230693] D [master(worker > /data/storage_c/storage):540:crawlwrap] _GMaster: primary master > with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ... > [2021-03-16 19:05:05.237607] I [gsyncdstatus(worker > /data/storage_c/storage):281:set_active] GeorepStatus: Worker > Status Change [{status=Active}] > [2021-03-16 19:05:05.242046] I [gsyncdstatus(worker > /data/storage_c/storage):253:set_worker_crawl_status] > GeorepStatus: Crawl Status Change [{status=History Crawl}] > [2021-03-16 19:05:05.242450] I [master(worker > /data/storage_c/storage):1559:crawl] _GMaster: starting history > crawl [{turns=1}, {stime=(1614666552, 0)}, > {entry_stime=(1614664108, 0)}, {etime=1615921505}] > [2021-03-16 19:05:05.244151] E [resource(worker > /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog > History Crawl failed [{error=[Errno 0] Success}] > [2021-03-16 19:05:05.394129] E [resource(worker > /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog > History Crawl failed [{error=[Errno 0] Success}] > [2021-03-16 19:05:05.408759] E [resource(worker > /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog > History Crawl failed [{error=[Errno 0] Success}] > [2021-03-16 19:05:06.158694] I [monitor(monitor):228:monitor] > Monitor: worker died in startup phase > [{brick=/data/storage_a/storage}] > [2021-03-16 19:05:06.163052] I > [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker > Status Change [{status=Faulty}] > [2021-03-16 19:05:06.204464] I [monitor(monitor):228:monitor] > Monitor: worker died in startup phase > [{brick=/data/storage_b/storage}] > [2021-03-16 19:05:06.208961] I > [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker > Status Change [{status=Faulty}] > [2021-03-16 19:05:06.220495] I [monitor(monitor):228:monitor] > Monitor: worker died in startup phase > [{brick=/data/storage_c/storage}] > [2021-03-16 19:05:06.223947] I > [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker > Status Change [{status=Faulty}] > > I confirmed NTP is working: > > > pcic-backup02 | CHANGED | rc=0 >> > ???? remote?????????? refid????? st t when poll reach?? delay?? > offset? jitter > =============================================================================> +s216-232-132-95 68.69.221.61???? 2 u?? 29 1024? 377?? 24.141??? > 2.457?? 1.081 > *yyz-1.ip.0xt.ca 206.108.0.131??? 2 u? 257 1024? 377?? 57.119?? > -0.084?? 5.625 > +ip102.ip-198-27 192.168.10.254?? 2 u? 189 1024? 377?? 64.227?? > -3.012?? 8.867 > > storage03 | CHANGED | rc=0 >> > ???? remote?????????? refid????? st t when poll reach?? delay?? > offset? jitter > =============================================================================> *198.161.203.36? 128.233.150.93?? 2 u?? 36 1024? 377?? 16.055?? > -0.381?? 0.318 > +s206-75-147-25. 192.168.10.254?? 2 u? 528 1024? 377?? 23.648?? > -6.196?? 4.803 > +time.cloudflare 10.69.8.80?????? 3 u? 121 1024? 377??? 2.408??? > 0.507?? 0.791 > > storage02 | CHANGED | rc=0 >> > ???? remote?????????? refid????? st t when poll reach?? delay?? > offset? jitter > =============================================================================> *198.161.203.36? 128.233.150.93?? 2 u? 918 1024? 377?? 15.952??? > 0.226?? 0.197 > +linuxgeneration 16.164.40.197??? 2 u?? 88 1024? 377?? 62.692?? > -1.160?? 2.007 > +dns3.switch.ca? 206.108.0.131??? 2 u? 857 1024? 377?? 27.315??? > 0.778?? 0.483 > > storage01 | CHANGED | rc=0 >> > ???? remote?????????? refid????? st t when poll reach?? delay?? > offset? jitter > =============================================================================> +198.161.203.36? 128.233.150.93?? 2 u? 121 1024? 377?? 16.069??? > 1.016?? 0.195 > +zero.gotroot.ca 30.114.5.31????? 2 u? 543 1024? 377??? 5.106?? > -2.462?? 4.923 > *ntp3.torix.ca?? .PTP0.?????????? 1 u? 300 1024? 377?? 54.010??? > 2.421? 15.182 > > pcic-backup01 | CHANGED | rc=0 >> > ???? remote?????????? refid????? st t when poll reach?? delay?? > offset? jitter > =============================================================================> *dns3.switch.ca? 206.108.0.131??? 2 u? 983 1024? 377?? 26.990??? > 0.523?? 1.389 > +dns2.switch.ca? 206.108.0.131??? 2 u? 689 1024? 377?? 26.975?? > -0.257?? 0.467 > +64.ip-54-39-23. 214.176.184.39?? 2 u? 909 1024? 377?? 64.262?? > -0.604?? 6.129 > > And everything is working on the same version of gluster: > > pcic-backup02 | CHANGED | rc=0 >> > glusterfs 8.3 > pcic-backup01 | CHANGED | rc=0 >> > glusterfs 8.3 > storage02 | CHANGED | rc=0 >> > glusterfs 8.3 > storage01 | CHANGED | rc=0 >> > glusterfs 8.3 > storage03 | CHANGED | rc=0 >> > glusterfs 8.3 > > SSH works, and the backup user/group is configured with mountbroker: > > [root at storage01 ~]# ssh -i /root/.ssh/id_rsa > geoaccount at 10.0.231.81 <mailto:geoaccount at 10.0.231.81> uname -a > Linux pcic-backup01 3.10.0-1160.15.2.el7.x86_64 #1 SMP Wed Feb 3 > 15:06:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux > [root at storage01 ~]# ssh -i /root/.ssh/id_rsa > geoaccount at 10.0.231.82 <mailto:geoaccount at 10.0.231.82> uname -a > Linux pcic-backup02 3.10.0-1160.15.2.el7.x86_64 #1 SMP Wed Feb 3 > 15:06:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux > > > [root at pcic-backup01 ~]# grep geo /etc/passwd > geoaccount:x:1000:1000::/home/geoaccount:/bin/bash > [root at pcic-backup01 ~]# grep geo /etc/group > geogroup:x:1000:geoaccount > geoaccount:x:1001:geoaccount > > [root at pcic-backup01 ~]# gluster-mountbroker status > +-------------+-------------+---------------------------+--------------+--------------------------+ > |???? NODE??? | NODE STATUS |???????? MOUNT ROOT??????? |??? > GROUP???? |????????? USERS?????????? | > +-------------+-------------+---------------------------+--------------+--------------------------+ > | 10.0.231.82 |????????? UP | /var/mountbroker-root(OK) | > geogroup(OK) | geoaccount(pcic-backup)? | > |? localhost? |????????? UP | /var/mountbroker-root(OK) | > geogroup(OK) | geoaccount(pcic-backup)? | > +-------------+-------------+---------------------------+--------------+--------------------------+ > > > > > So, then if I'm going to have to resync, what is the best way to > do this? > > With delete or delete reset-sync-time ??? > https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/sect-starting_geo-replication#Deleting_a_Geo-replication_Session > <https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/sect-starting_geo-replication#Deleting_a_Geo-replication_Session> > > > > Erasing the index? So I don't have to transfer the files again > that are already on the backup? > > * https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/administration_guide/sect-troubleshooting_geo-replication#Synchronization_Is_Not_Complete > <https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/html/administration_guide/sect-troubleshooting_geo-replication#Synchronization_Is_Not_Complete> > > * https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Geo%20Replication/#best-practices > <https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Geo%20Replication/#best-practices> > > > > > Is it possible to use the special-sync-mode? option from here: > https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/sect-disaster_recovery > <https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/sect-disaster_recovery> > > > > Thoughts? > > Thanks, > ?-Matthew > -- > > On 3/12/21 3:31 PM, Strahil Nikolov wrote: >> Notice: This message was sent from outside the University of >> Victoria email system. Please be cautious with links and >> sensitive information. >> >> Usually, when I'm stuck - I just start over. >> For example, check the prerequisites: >> - Is ssh available (no firewall blocking) >> - Is time sync enabled (ntp/chrony) >> - Is DNS ok on all hosts (including PTR records) >> - Is the gluster version the same on all nodes (primary & secondary) >> >> Then start over as if the geo rep was never existing. For example >> , stop it and start over with the secondary nodes's checks >> (mountbroker, user, group) . >> >> Most probably somwthing will come up and you will fix it. >> >> In worst case scenario, you will need to clean ip the geo-rep and >> start fresh. >> >> >> Best Regards, >> Strahil Nikolov >> >> On Fri, Mar 12, 2021 at 20:01, Matthew Benstead >> <matthewb at uvic.ca> <mailto:matthewb at uvic.ca> wrote: >> Hi Strahil, >> >> Yes, SELinux was put into permissive mode on the secondary >> nodes as well: >> >> [root at pcic-backup01 ~]# sestatus | egrep -i? "^SELinux >> status|mode" >> SELinux status:???????????????? enabled >> Current mode:?????????????????? permissive >> Mode from config file:????????? enforcing >> >> [root at pcic-backup02 ~]# sestatus | egrep -i? "^SELinux >> status|mode" >> SELinux status:???????????????? enabled >> Current mode:?????????????????? permissive >> Mode from config file:????????? enforcing >> >> The secondary server logs didn't show anything interesting: >> >> gsyncd.log: >> >> [2021-03-11 19:15:28.81820] I [resource(slave >> 10.0.231.92/data/storage_c/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:28.101819] I [resource(slave >> 10.0.231.91/data/storage_a/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:28.107012] I [resource(slave >> 10.0.231.93/data/storage_c/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:28.124567] I [resource(slave >> 10.0.231.93/data/storage_b/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:28.128145] I [resource(slave >> 10.0.231.93/data/storage_a/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:29.425739] I [resource(slave >> 10.0.231.93/data/storage_c/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3184}] >> [2021-03-11 19:15:29.427448] I [resource(slave >> 10.0.231.93/data/storage_c/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:29.433340] I [resource(slave >> 10.0.231.93/data/storage_b/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3083}] >> [2021-03-11 19:15:29.434452] I [resource(slave >> 10.0.231.91/data/storage_a/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3321}] >> [2021-03-11 19:15:29.434314] I [resource(slave >> 10.0.231.93/data/storage_b/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:29.435575] I [resource(slave >> 10.0.231.91/data/storage_a/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:29.439769] I [resource(slave >> 10.0.231.92/data/storage_c/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3576}] >> [2021-03-11 19:15:29.440998] I [resource(slave >> 10.0.231.92/data/storage_c/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:29.454745] I [resource(slave >> 10.0.231.93/data/storage_a/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3262}] >> [2021-03-11 19:15:29.456192] I [resource(slave >> 10.0.231.93/data/storage_a/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:32.594865] I [repce(slave >> 10.0.231.92/data/storage_c/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:32.607815] I [repce(slave >> 10.0.231.93/data/storage_c/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:32.647663] I [repce(slave >> 10.0.231.93/data/storage_b/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:32.656280] I [repce(slave >> 10.0.231.91/data/storage_a/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:32.668299] I [repce(slave >> 10.0.231.93/data/storage_a/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:44.260689] I [resource(slave >> 10.0.231.92/data/storage_c/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:44.271457] I [resource(slave >> 10.0.231.93/data/storage_c/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:44.271883] I [resource(slave >> 10.0.231.93/data/storage_b/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:44.279670] I [resource(slave >> 10.0.231.91/data/storage_a/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:44.284261] I [resource(slave >> 10.0.231.93/data/storage_a/storage):1116:connect] GLUSTER: >> Mounting gluster volume locally... >> [2021-03-11 19:15:45.614280] I [resource(slave >> 10.0.231.93/data/storage_b/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3419}] >> [2021-03-11 19:15:45.615622] I [resource(slave >> 10.0.231.93/data/storage_b/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:45.617986] I [resource(slave >> 10.0.231.93/data/storage_c/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3461}] >> [2021-03-11 19:15:45.618180] I [resource(slave >> 10.0.231.91/data/storage_a/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3380}] >> [2021-03-11 19:15:45.619539] I [resource(slave >> 10.0.231.91/data/storage_a/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:45.618999] I [resource(slave >> 10.0.231.93/data/storage_c/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:45.620843] I [resource(slave >> 10.0.231.93/data/storage_a/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3361}] >> [2021-03-11 19:15:45.621347] I [resource(slave >> 10.0.231.92/data/storage_c/storage):1139:connect] GLUSTER: >> Mounted gluster volume [{duration=1.3604}] >> [2021-03-11 19:15:45.622179] I [resource(slave >> 10.0.231.93/data/storage_a/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:45.622541] I [resource(slave >> 10.0.231.92/data/storage_c/storage):1166:service_loop] >> GLUSTER: slave listening >> [2021-03-11 19:15:47.626054] I [repce(slave >> 10.0.231.91/data/storage_a/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:48.778399] I [repce(slave >> 10.0.231.93/data/storage_c/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:48.778491] I [repce(slave >> 10.0.231.92/data/storage_c/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:48.796854] I [repce(slave >> 10.0.231.93/data/storage_a/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> [2021-03-11 19:15:48.800697] I [repce(slave >> 10.0.231.93/data/storage_b/storage):96:service_loop] >> RepceServer: terminating on reaching EOF. >> >> The mnt geo-rep files were also uninteresting: >> [2021-03-11 19:15:28.250150] I [MSGID: 100030] >> [glusterfsd.c:2689:main] 0-/usr/sbin/glusterfs: Started >> running version [{arg=/usr/sbin/glusterfs}, {version=8.3}, >> {cmdlinestr=/usr/sbin/glusterfs --user-map-root=g >> eoaccount --aux-gfid-mount --acl --log-level=INFO >> --log-file=/var/log/glusterfs/geo-replication-slaves/storage_10.0.231.81_pcic-backup/mnt-10.0.231.93-data-storage_b-storage.log >> --volfile-server=localhost --volf >> ile-id=pcic-backup --client-pid=-1 >> /var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI}] >> [2021-03-11 19:15:28.253485] I [glusterfsd.c:2424:daemonize] >> 0-glusterfs: Pid of current running process is 157484 >> [2021-03-11 19:15:28.267911] I [MSGID: 101190] >> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >> Started thread with index [{index=0}] >> [2021-03-11 19:15:28.267984] I [MSGID: 101190] >> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >> Started thread with index [{index=1}] >> [2021-03-11 19:15:28.268371] I >> [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: >> Received list of available volfile servers: 10.0.231.82:24007 >> [2021-03-11 19:15:28.271729] I [MSGID: 101190] >> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >> Started thread with index [{index=2}] >> [2021-03-11 19:15:28.271762] I [MSGID: 101190] >> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >> Started thread with index [{index=3}] >> [2021-03-11 19:15:28.272223] I [MSGID: 114020] >> [client.c:2315:notify] 0-pcic-backup-client-0: parent >> translators are ready, attempting connect on transport [] >> [2021-03-11 19:15:28.275883] I [MSGID: 114020] >> [client.c:2315:notify] 0-pcic-backup-client-1: parent >> translators are ready, attempting connect on transport [] >> [2021-03-11 19:15:28.276154] I >> [rpc-clnt.c:1975:rpc_clnt_reconfig] 0-pcic-backup-client-0: >> changing port to 49153 (from 0) >> [2021-03-11 19:15:28.276193] I >> [socket.c:849:__socket_shutdown] 0-pcic-backup-client-0: >> intentional socket shutdown(13) >> Final graph: >> ... >> +------------------------------------------------------------------------------+ >> [2021-03-11 19:15:28.282144] I >> [socket.c:849:__socket_shutdown] 0-pcic-backup-client-1: >> intentional socket shutdown(15) >> [2021-03-11 19:15:28.286536] I [MSGID: 114057] >> [client-handshake.c:1128:select_server_supported_programs] >> 0-pcic-backup-client-0: Using Program >> [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}] >> [2021-03-11 19:15:28.287208] I [MSGID: 114046] >> [client-handshake.c:857:client_setvolume_cbk] >> 0-pcic-backup-client-0: Connected, attached to remote volume >> [{conn-name=pcic-backup-client-0}, {remote_subvol=/data/brick}] >> [2021-03-11 19:15:28.290162] I [MSGID: 114057] >> [client-handshake.c:1128:select_server_supported_programs] >> 0-pcic-backup-client-1: Using Program >> [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}] >> [2021-03-11 19:15:28.291122] I [MSGID: 114046] >> [client-handshake.c:857:client_setvolume_cbk] >> 0-pcic-backup-client-1: Connected, attached to remote volume >> [{conn-name=pcic-backup-client-1}, {remote_subvol=/data/brick}] >> [2021-03-11 19:15:28.292703] I [fuse-bridge.c:5300:fuse_init] >> 0-glusterfs-fuse: FUSE inited with protocol versions: >> glusterfs 7.24 kernel 7.23 >> [2021-03-11 19:15:28.292730] I >> [fuse-bridge.c:5926:fuse_graph_sync] 0-fuse: switched to graph 0 >> [2021-03-11 19:15:32.809518] I >> [fuse-bridge.c:6242:fuse_thread_proc] 0-fuse: initiating >> unmount of /var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI >> [2021-03-11 19:15:32.810216] W >> [glusterfsd.c:1439:cleanup_and_exit] >> (-->/lib64/libpthread.so.0(+0x7ea5) [0x7ff56b175ea5] >> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) >> [0x55664e67db45] >> -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) >> [0x55664e67d9ab] ) 0-: received signum (15), shutting down >> [2021-03-11 19:15:32.810253] I [fuse-bridge.c:7074:fini] >> 0-fuse: Unmounting >> '/var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI'. >> [2021-03-11 19:15:32.810268] I [fuse-bridge.c:7079:fini] >> 0-fuse: Closing fuse connection to >> '/var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI'. >> >> >> I'm really at a loss for where to go from here, it seems like >> everything is set up correctly, and it has been working well >> through the 7.x minor versions, but the jump to 8 has broken >> something... >> >> There definitely are lots of changelogs on the servers that >> fit into the timeframe..... I haven't made any writes to the >> source volume.... do you think that's the problem? That it >> needs some new changelog info to sync? >> I had been holding off making any writes in case I needed to >> go back to Gluster7.9 - not sure if that's really a good >> option or not. >> >> [root at storage01 changelogs]# for dirs in {a,b,c}; do echo >> "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh >> /data/storage_$dirs/storage/.glusterfs/changelogs | head; >> echo ""; done >> /data/storage_a/storage/.glusterfs/changelogs >> total 16G >> drw-------. 3 root root?? 24 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG >> -rw-r--r--. 1 root root? 13K Aug 13? 2020 CHANGELOG.1597343197 >> -rw-r--r--. 1 root root? 51K Aug 13? 2020 CHANGELOG.1597343212 >> -rw-r--r--. 1 root root? 86K Aug 13? 2020 CHANGELOG.1597343227 >> -rw-r--r--. 1 root root? 99K Aug 13? 2020 CHANGELOG.1597343242 >> -rw-r--r--. 1 root root? 69K Aug 13? 2020 CHANGELOG.1597343257 >> -rw-r--r--. 1 root root? 69K Aug 13? 2020 CHANGELOG.1597343272 >> -rw-r--r--. 1 root root? 72K Aug 13? 2020 CHANGELOG.1597343287 >> >> /data/storage_b/storage/.glusterfs/changelogs >> total 3.3G >> drw-------. 3 root root?? 24 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG >> -rw-r--r--. 1 root root? 13K Aug 13? 2020 CHANGELOG.1597343197 >> -rw-r--r--. 1 root root? 53K Aug 13? 2020 CHANGELOG.1597343212 >> -rw-r--r--. 1 root root? 89K Aug 13? 2020 CHANGELOG.1597343227 >> -rw-r--r--. 1 root root? 89K Aug 13? 2020 CHANGELOG.1597343242 >> -rw-r--r--. 1 root root? 69K Aug 13? 2020 CHANGELOG.1597343257 >> -rw-r--r--. 1 root root? 71K Aug 13? 2020 CHANGELOG.1597343272 >> -rw-r--r--. 1 root root? 86K Aug 13? 2020 CHANGELOG.1597343287 >> >> /data/storage_c/storage/.glusterfs/changelogs >> total 9.6G >> drw-------. 3 root root?? 16 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG >> -rw-r--r--. 1 root root? 16K Aug 13? 2020 CHANGELOG.1597343199 >> -rw-r--r--. 1 root root? 71K Aug 13? 2020 CHANGELOG.1597343214 >> -rw-r--r--. 1 root root 122K Aug 13? 2020 CHANGELOG.1597343229 >> -rw-r--r--. 1 root root? 73K Aug 13? 2020 CHANGELOG.1597343244 >> -rw-r--r--. 1 root root 100K Aug 13? 2020 CHANGELOG.1597343259 >> -rw-r--r--. 1 root root? 95K Aug 13? 2020 CHANGELOG.1597343274 >> -rw-r--r--. 1 root root? 92K Aug 13? 2020 CHANGELOG.1597343289 >> >> [root at storage01 changelogs]# for dirs in {a,b,c}; do echo >> "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh >> /data/storage_$dirs/storage/.glusterfs/changelogs | tail; >> echo ""; done >> /data/storage_a/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:33 CHANGELOG.1614663193 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663731 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663760 >> -rw-r--r--. 1 root root? 511 Mar? 1 21:47 CHANGELOG.1614664043 >> -rw-r--r--. 1 root root? 536 Mar? 1 21:48 CHANGELOG.1614664101 >> -rw-r--r--. 1 root root 2.8K Mar? 1 21:48 CHANGELOG.1614664116 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666061 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666554 >> drw-------. 2 root root?? 10 May? 7? 2020 csnap >> drw-------. 2 root root?? 38 Aug 13? 2020 htime >> >> /data/storage_b/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663731 >> -rw-r--r--. 1 root root? 480 Mar? 1 21:42 CHANGELOG.1614663745 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663760 >> -rw-r--r--. 1 root root? 524 Mar? 1 21:47 CHANGELOG.1614664043 >> -rw-r--r--. 1 root root? 495 Mar? 1 21:48 CHANGELOG.1614664100 >> -rw-r--r--. 1 root root 1.6K Mar? 1 21:48 CHANGELOG.1614664114 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666060 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666553 >> drw-------. 2 root root?? 10 May? 7? 2020 csnap >> drw-------. 2 root root?? 38 Aug 13? 2020 htime >> >> /data/storage_c/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663738 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663753 >> -rw-r--r--. 1 root root? 395 Mar? 1 21:47 CHANGELOG.1614664051 >> -rw-r--r--. 1 root root? 316 Mar? 1 21:48 CHANGELOG.1614664094 >> -rw-r--r--. 1 root root 1.2K Mar? 1 21:48 CHANGELOG.1614664109 >> -rw-r--r--. 1 root root? 174 Mar? 1 21:48 CHANGELOG.1614664123 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666061 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666553 >> drw-------. 2 root root??? 6 May? 7? 2020 csnap >> drw-------. 2 root root?? 30 Aug 13? 2020 htime >> >> [root at storage02 ~]# for dirs in {a,b,c}; do echo >> "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh >> /data/storage_$dirs/storage/.glusterfs/changelogs | head; >> echo ""; done >> /data/storage_a/storage/.glusterfs/changelogs >> total 9.6G >> drw-------. 3 root root?? 24 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG >> -rw-r--r--. 1 root root 4.2K Aug 13? 2020 CHANGELOG.1597343193 >> -rw-r--r--. 1 root root? 32K Aug 13? 2020 CHANGELOG.1597343208 >> -rw-r--r--. 1 root root 107K Aug 13? 2020 CHANGELOG.1597343223 >> -rw-r--r--. 1 root root 120K Aug 13? 2020 CHANGELOG.1597343238 >> -rw-r--r--. 1 root root? 72K Aug 13? 2020 CHANGELOG.1597343253 >> -rw-r--r--. 1 root root 111K Aug 13? 2020 CHANGELOG.1597343268 >> -rw-r--r--. 1 root root? 91K Aug 13? 2020 CHANGELOG.1597343283 >> >> /data/storage_b/storage/.glusterfs/changelogs >> total 16G >> drw-------. 3 root root?? 24 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG >> -rw-r--r--. 1 root root 3.9K Aug 13? 2020 CHANGELOG.1597343193 >> -rw-r--r--. 1 root root? 35K Aug 13? 2020 CHANGELOG.1597343208 >> -rw-r--r--. 1 root root? 85K Aug 13? 2020 CHANGELOG.1597343223 >> -rw-r--r--. 1 root root 103K Aug 13? 2020 CHANGELOG.1597343238 >> -rw-r--r--. 1 root root? 70K Aug 13? 2020 CHANGELOG.1597343253 >> -rw-r--r--. 1 root root? 72K Aug 13? 2020 CHANGELOG.1597343268 >> -rw-r--r--. 1 root root? 73K Aug 13? 2020 CHANGELOG.1597343283 >> >> /data/storage_c/storage/.glusterfs/changelogs >> total 3.3G >> drw-------. 3 root root?? 16 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:51 CHANGELOG >> -rw-r--r--. 1 root root? 21K Aug 13? 2020 CHANGELOG.1597343202 >> -rw-r--r--. 1 root root? 75K Aug 13? 2020 CHANGELOG.1597343217 >> -rw-r--r--. 1 root root? 92K Aug 13? 2020 CHANGELOG.1597343232 >> -rw-r--r--. 1 root root? 77K Aug 13? 2020 CHANGELOG.1597343247 >> -rw-r--r--. 1 root root? 66K Aug 13? 2020 CHANGELOG.1597343262 >> -rw-r--r--. 1 root root? 84K Aug 13? 2020 CHANGELOG.1597343277 >> -rw-r--r--. 1 root root? 81K Aug 13? 2020 CHANGELOG.1597343292 >> >> [root at storage02 ~]# for dirs in {a,b,c}; do echo >> "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh >> /data/storage_$dirs/storage/.glusterfs/changelogs | tail; >> echo ""; done >> /data/storage_a/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663734 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663749 >> -rw-r--r--. 1 root root? 395 Mar? 1 21:47 CHANGELOG.1614664052 >> -rw-r--r--. 1 root root? 316 Mar? 1 21:48 CHANGELOG.1614664096 >> -rw-r--r--. 1 root root 1.2K Mar? 1 21:48 CHANGELOG.1614664111 >> -rw-r--r--. 1 root root? 174 Mar? 1 21:48 CHANGELOG.1614664126 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666056 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666560 >> drw-------. 2 root root?? 10 May? 7? 2020 csnap >> drw-------. 2 root root?? 38 Aug 13? 2020 htime >> >> /data/storage_b/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663735 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663749 >> -rw-r--r--. 1 root root? 511 Mar? 1 21:47 CHANGELOG.1614664052 >> -rw-r--r--. 1 root root? 316 Mar? 1 21:48 CHANGELOG.1614664096 >> -rw-r--r--. 1 root root 1.8K Mar? 1 21:48 CHANGELOG.1614664111 >> -rw-r--r--. 1 root root 1.4K Mar? 1 21:48 CHANGELOG.1614664126 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666060 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666556 >> drw-------. 2 root root?? 10 May? 7? 2020 csnap >> drw-------. 2 root root?? 38 Aug 13? 2020 htime >> >> /data/storage_c/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663738 >> -rw-r--r--. 1 root root? 521 Mar? 1 21:42 CHANGELOG.1614663752 >> -rw-r--r--. 1 root root? 524 Mar? 1 21:47 CHANGELOG.1614664042 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:47 CHANGELOG.1614664057 >> -rw-r--r--. 1 root root? 536 Mar? 1 21:48 CHANGELOG.1614664102 >> -rw-r--r--. 1 root root 1.6K Mar? 1 21:48 CHANGELOG.1614664117 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666057 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666550 >> drw-------. 2 root root??? 6 May? 7? 2020 csnap >> drw-------. 2 root root?? 30 Aug 13? 2020 htime >> >> >> [root at storage03 ~]# for dirs in {a,b,c}; do echo >> "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh >> /data/storage_$dirs/storage/.glusterfs/changelogs | head; >> echo ""; done >> /data/storage_a/storage/.glusterfs/changelogs >> total 3.4G >> drw-------. 3 root root?? 24 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:50 CHANGELOG >> -rw-r--r--. 1 root root? 19K Aug 13? 2020 CHANGELOG.1597343201 >> -rw-r--r--. 1 root root? 66K Aug 13? 2020 CHANGELOG.1597343215 >> -rw-r--r--. 1 root root? 91K Aug 13? 2020 CHANGELOG.1597343230 >> -rw-r--r--. 1 root root? 82K Aug 13? 2020 CHANGELOG.1597343245 >> -rw-r--r--. 1 root root? 64K Aug 13? 2020 CHANGELOG.1597343259 >> -rw-r--r--. 1 root root? 75K Aug 13? 2020 CHANGELOG.1597343274 >> -rw-r--r--. 1 root root? 81K Aug 13? 2020 CHANGELOG.1597343289 >> >> /data/storage_b/storage/.glusterfs/changelogs >> total 9.6G >> drw-------. 3 root root?? 24 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:51 CHANGELOG >> -rw-r--r--. 1 root root? 19K Aug 13? 2020 CHANGELOG.1597343201 >> -rw-r--r--. 1 root root? 80K Aug 13? 2020 CHANGELOG.1597343215 >> -rw-r--r--. 1 root root 119K Aug 13? 2020 CHANGELOG.1597343230 >> -rw-r--r--. 1 root root? 65K Aug 13? 2020 CHANGELOG.1597343244 >> -rw-r--r--. 1 root root 100K Aug 13? 2020 CHANGELOG.1597343259 >> -rw-r--r--. 1 root root? 95K Aug 13? 2020 CHANGELOG.1597343274 >> -rw-r--r--. 1 root root? 92K Aug 13? 2020 CHANGELOG.1597343289 >> >> /data/storage_c/storage/.glusterfs/changelogs >> total 16G >> drw-------. 3 root root?? 16 Mar? 9 11:34 2021 >> -rw-r--r--. 1 root root?? 51 Mar 12 09:51 CHANGELOG >> -rw-r--r--. 1 root root 3.9K Aug 13? 2020 CHANGELOG.1597343193 >> -rw-r--r--. 1 root root? 35K Aug 13? 2020 CHANGELOG.1597343208 >> -rw-r--r--. 1 root root? 85K Aug 13? 2020 CHANGELOG.1597343223 >> -rw-r--r--. 1 root root 103K Aug 13? 2020 CHANGELOG.1597343238 >> -rw-r--r--. 1 root root? 70K Aug 13? 2020 CHANGELOG.1597343253 >> -rw-r--r--. 1 root root? 71K Aug 13? 2020 CHANGELOG.1597343268 >> -rw-r--r--. 1 root root? 73K Aug 13? 2020 CHANGELOG.1597343283 >> >> [root at storage03 ~]# for dirs in {a,b,c}; do echo >> "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh >> /data/storage_$dirs/storage/.glusterfs/changelogs | tail; >> echo ""; done >> /data/storage_a/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:33 CHANGELOG.1614663183 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663740 >> -rw-r--r--. 1 root root? 521 Mar? 1 21:42 CHANGELOG.1614663755 >> -rw-r--r--. 1 root root? 524 Mar? 1 21:47 CHANGELOG.1614664049 >> -rw-r--r--. 1 root root 1.9K Mar? 1 21:48 CHANGELOG.1614664106 >> -rw-r--r--. 1 root root? 174 Mar? 1 21:48 CHANGELOG.1614664121 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666051 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666559 >> drw-------. 2 root root?? 10 May? 7? 2020 csnap >> drw-------. 2 root root?? 38 Aug 13? 2020 htime >> >> /data/storage_b/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root? 474 Mar? 1 21:33 CHANGELOG.1614663182 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663739 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663753 >> -rw-r--r--. 1 root root? 395 Mar? 1 21:47 CHANGELOG.1614664049 >> -rw-r--r--. 1 root root 1.4K Mar? 1 21:48 CHANGELOG.1614664106 >> -rw-r--r--. 1 root root? 174 Mar? 1 21:48 CHANGELOG.1614664120 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666063 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666557 >> drw-------. 2 root root?? 10 May? 7? 2020 csnap >> drw-------. 2 root root?? 38 Aug 13? 2020 htime >> >> /data/storage_c/storage/.glusterfs/changelogs >> -rw-r--r--. 1 root root? 468 Mar? 1 21:33 CHANGELOG.1614663183 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663740 >> -rw-r--r--. 1 root root?? 92 Mar? 1 21:42 CHANGELOG.1614663754 >> -rw-r--r--. 1 root root? 511 Mar? 1 21:47 CHANGELOG.1614664048 >> -rw-r--r--. 1 root root 2.0K Mar? 1 21:48 CHANGELOG.1614664105 >> -rw-r--r--. 1 root root 1.4K Mar? 1 21:48 CHANGELOG.1614664120 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:20 CHANGELOG.1614666063 >> -rw-r--r--. 1 root root?? 92 Mar? 1 22:29 CHANGELOG.1614666556 >> drw-------. 2 root root??? 6 May? 7? 2020 csnap >> drw-------. 2 root root?? 30 Aug 13? 2020 htime >> >> Thanks, >> ?-Matthew >> >> -- >> Matthew Benstead >> System Administrator >> Pacific Climate Impacts Consortium <https://pacificclimate.org/> >> University of Victoria, UH1 >> PO Box 1800, STN CSC >> Victoria, BC, V8W 2Y2 >> Phone: +1-250-721-8432 >> Email: matthewb at uvic.ca <mailto:matthewb at uvic.ca> >> >> On 3/11/21 11:37 PM, Strahil Nikolov wrote: >>> Notice: This message was sent from outside the University of >>> Victoria email system. Please be cautious with links and >>> sensitive information. >>> >>> Have you checked the secondary volume nodes' logs & SELINUX >>> status ? >>> >>> Best Regards, >>> Strahil Nikolov >>> >>> On Thu, Mar 11, 2021 at 21:36, Matthew Benstead >>> <matthewb at uvic.ca> <mailto:matthewb at uvic.ca> wrote: >>> Hi Strahil, >>> >>> It looks like perhaps the changelog_log_level and >>> log_level options? I've set them to debug: >>> >>> [root at storage01 ~]# gluster volume geo-replication >>> storage geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> config | >>> egrep -i "log_level" >>> changelog_log_level:INFO >>> cli_log_level:INFO >>> gluster_log_level:INFO >>> log_level:INFO >>> slave_gluster_log_level:INFO >>> slave_log_level:INFO >>> >>> [root at storage01 ~]# gluster volume geo-replication >>> storage geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> config >>> changelog_log_level DEBUG >>> geo-replication config updated successfully >>> >>> [root at storage01 ~]# gluster volume geo-replication >>> storage geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> config >>> log_level DEBUG >>> geo-replication config updated successfully >>> >>> >>> Then I restarted geo-replication: >>> >>> [root at storage01 ~]# gluster volume geo-replication >>> storage geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> start >>> Starting geo-replication session between storage & >>> geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> has been >>> successful >>> [root at storage01 ~]# gluster volume geo-replication status >>> ? >>> MASTER NODE??? MASTER VOL??? MASTER BRICK?????????????? >>> SLAVE USER??? >>> SLAVE??????????????????????????????????????? SLAVE >>> NODE??? STATUS???????????? CRAWL STATUS??? >>> LAST_SYNCED????????? >>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ >>> 10.0.231.91??? storage?????? /data/storage_a/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.91??? storage?????? /data/storage_c/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.91??? storage?????? /data/storage_b/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.92??? storage?????? /data/storage_b/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.92??? storage?????? /data/storage_a/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.92??? storage?????? /data/storage_c/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.93??? storage?????? /data/storage_c/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.93??? storage?????? /data/storage_b/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.93??? storage?????? /data/storage_a/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Initializing...??? N/A???????????? >>> N/A????????????????? >>> [root at storage01 ~]# gluster volume geo-replication status >>> ? >>> MASTER NODE??? MASTER VOL??? MASTER BRICK?????????????? >>> SLAVE USER??? >>> SLAVE??????????????????????????????????????? SLAVE >>> NODE??? STATUS??? CRAWL STATUS??? LAST_SYNCED????????? >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> 10.0.231.91??? storage?????? /data/storage_a/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.91??? storage?????? /data/storage_c/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.91??? storage?????? /data/storage_b/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.92??? storage?????? /data/storage_b/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.92??? storage?????? /data/storage_a/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.92??? storage?????? /data/storage_c/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.93??? storage?????? /data/storage_c/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.93??? storage?????? /data/storage_b/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? >>> N/A????????????????? >>> 10.0.231.93??? storage?????? /data/storage_a/storage??? >>> geoaccount??? ssh://geoaccount at 10.0.231.81::pcic-backup >>> <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup>??? >>> N/A?????????? Faulty??? N/A???????????? N/A? >>> >>> [root at storage01 ~]# gluster volume geo-replication >>> storage geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> stop >>> Stopping geo-replication session between storage & >>> geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> has been >>> successful >>> >>> >>> The changelogs didn't really show anything new around >>> changelog selection: >>> >>> [root at storage01 storage_10.0.231.81_pcic-backup]# cat >>> changes-data-storage_a-storage.log | egrep "2021-03-11" >>> [2021-03-11 19:15:30.552889] I [MSGID: 132028] >>> [gf-changelog.c:577:gf_changelog_register_generic] >>> 0-gfchangelog: Registering brick >>> [{brick=/data/storage_a/storage}, {notify_filter=1}] >>> [2021-03-11 19:15:30.552893] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=0}] >>> [2021-03-11 19:15:30.552894] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=1}] >>> [2021-03-11 19:15:30.553633] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=3}] >>> [2021-03-11 19:15:30.553634] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=2}] >>> [2021-03-11 19:15:30.554236] D >>> [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service >>> inited. >>> [2021-03-11 19:15:30.554403] D >>> [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: >>> New program registered: GF-DUMP, Num: 123451501, Ver: 1, >>> Port: 0 >>> [2021-03-11 19:15:30.554420] D >>> [rpc-transport.c:278:rpc_transport_load] >>> 0-rpc-transport: attempt to load file >>> /usr/lib64/glusterfs/8.3/rpc-transport/socket.so >>> [2021-03-11 19:15:30.554933] D >>> [socket.c:4485:socket_init] 0-socket.gfchangelog: >>> disabling nodelay >>> [2021-03-11 19:15:30.554944] D >>> [socket.c:4523:socket_init] 0-socket.gfchangelog: >>> Configured transport.tcp-user-timeout=42 >>> [2021-03-11 19:15:30.554949] D >>> [socket.c:4543:socket_init] 0-socket.gfchangelog: >>> Reconfigured transport.keepalivecnt=9 >>> [2021-03-11 19:15:30.555002] I >>> [socket.c:929:__socket_server_bind] >>> 0-socket.gfchangelog: closing (AF_UNIX) reuse check >>> socket 23 >>> [2021-03-11 19:15:30.555324] D >>> [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: >>> New program registered: LIBGFCHANGELOG REBORP, Num: >>> 1886350951, Ver: 1, Port: 0 >>> [2021-03-11 19:15:30.555345] D >>> [rpc-clnt.c:1020:rpc_clnt_connection_init] >>> 0-gfchangelog: defaulting frame-timeout to 30mins >>> [2021-03-11 19:15:30.555351] D >>> [rpc-clnt.c:1032:rpc_clnt_connection_init] >>> 0-gfchangelog: disable ping-timeout >>> [2021-03-11 19:15:30.555358] D >>> [rpc-transport.c:278:rpc_transport_load] >>> 0-rpc-transport: attempt to load file >>> /usr/lib64/glusterfs/8.3/rpc-transport/socket.so >>> [2021-03-11 19:15:30.555399] D >>> [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay >>> [2021-03-11 19:15:30.555406] D >>> [socket.c:4523:socket_init] 0-gfchangelog: Configured >>> transport.tcp-user-timeout=42 >>> [2021-03-11 19:15:32.555711] D >>> [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: >>> ping timeout is 0, returning >>> [2021-03-11 19:15:32.572157] I [MSGID: 132035] >>> [gf-history-changelog.c:837:gf_history_changelog] >>> 0-gfchangelog: Requesting historical changelogs >>> [{start=1614666553}, {end=1615490132}] >>> [2021-03-11 19:15:32.572436] I [MSGID: 132019] >>> [gf-history-changelog.c:755:gf_changelog_extract_min_max] >>> 0-gfchangelog: changelogs min max [{min=1597342860}, >>> {max=1615490121}, {total_changelogs=1256897}] >>> [2021-03-11 19:15:32.621244] E [MSGID: 132009] >>> [gf-history-changelog.c:941:gf_history_changelog] >>> 0-gfchangelog: wrong result [{for=end}, >>> {start=1615490121}, {idx=1256896}] >>> [2021-03-11 19:15:46.733182] I [MSGID: 132028] >>> [gf-changelog.c:577:gf_changelog_register_generic] >>> 0-gfchangelog: Registering brick >>> [{brick=/data/storage_a/storage}, {notify_filter=1}] >>> [2021-03-11 19:15:46.733316] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=0}] >>> [2021-03-11 19:15:46.733348] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=1}] >>> [2021-03-11 19:15:46.734031] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=2}] >>> [2021-03-11 19:15:46.734085] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=3}] >>> [2021-03-11 19:15:46.734591] D >>> [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service >>> inited. >>> [2021-03-11 19:15:46.734755] D >>> [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: >>> New program registered: GF-DUMP, Num: 123451501, Ver: 1, >>> Port: 0 >>> [2021-03-11 19:15:46.734772] D >>> [rpc-transport.c:278:rpc_transport_load] >>> 0-rpc-transport: attempt to load file >>> /usr/lib64/glusterfs/8.3/rpc-transport/socket.so >>> [2021-03-11 19:15:46.735256] D >>> [socket.c:4485:socket_init] 0-socket.gfchangelog: >>> disabling nodelay >>> [2021-03-11 19:15:46.735266] D >>> [socket.c:4523:socket_init] 0-socket.gfchangelog: >>> Configured transport.tcp-user-timeout=42 >>> [2021-03-11 19:15:46.735271] D >>> [socket.c:4543:socket_init] 0-socket.gfchangelog: >>> Reconfigured transport.keepalivecnt=9 >>> [2021-03-11 19:15:46.735325] I >>> [socket.c:929:__socket_server_bind] >>> 0-socket.gfchangelog: closing (AF_UNIX) reuse check >>> socket 21 >>> [2021-03-11 19:15:46.735704] D >>> [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: >>> New program registered: LIBGFCHANGELOG REBORP, Num: >>> 1886350951, Ver: 1, Port: 0 >>> [2021-03-11 19:15:46.735721] D >>> [rpc-clnt.c:1020:rpc_clnt_connection_init] >>> 0-gfchangelog: defaulting frame-timeout to 30mins >>> [2021-03-11 19:15:46.735726] D >>> [rpc-clnt.c:1032:rpc_clnt_connection_init] >>> 0-gfchangelog: disable ping-timeout >>> [2021-03-11 19:15:46.735733] D >>> [rpc-transport.c:278:rpc_transport_load] >>> 0-rpc-transport: attempt to load file >>> /usr/lib64/glusterfs/8.3/rpc-transport/socket.so >>> [2021-03-11 19:15:46.735771] D >>> [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay >>> [2021-03-11 19:15:46.735778] D >>> [socket.c:4523:socket_init] 0-gfchangelog: Configured >>> transport.tcp-user-timeout=42 >>> [2021-03-11 19:15:47.618464] D >>> [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: >>> ping timeout is 0, returning >>> >>> >>> [root at storage01 storage_10.0.231.81_pcic-backup]# cat >>> changes-data-storage_b-storage.log | egrep "2021-03-11" >>> [2021-03-11 19:15:30.611457] I [MSGID: 132028] >>> [gf-changelog.c:577:gf_changelog_register_generic] >>> 0-gfchangelog: Registering brick >>> [{brick=/data/storage_b/storage}, {notify_filter=1}] >>> [2021-03-11 19:15:30.611574] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=1}] >>> [2021-03-11 19:15:30.611641] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=3}] >>> [2021-03-11 19:15:30.611645] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=2}] >>> [2021-03-11 19:15:30.612325] D >>> [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service >>> inited. >>> [2021-03-11 19:15:30.612488] D >>> [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: >>> New program registered: GF-DUMP, Num: 123451501, Ver: 1, >>> Port: 0 >>> [2021-03-11 19:15:30.612507] D >>> [rpc-transport.c:278:rpc_transport_load] >>> 0-rpc-transport: attempt to load file >>> /usr/lib64/glusterfs/8.3/rpc-transport/socket.so >>> [2021-03-11 19:15:30.613005] D >>> [socket.c:4485:socket_init] 0-socket.gfchangelog: >>> disabling nodelay >>> [2021-03-11 19:15:30.613130] D >>> [socket.c:4523:socket_init] 0-socket.gfchangelog: >>> Configured transport.tcp-user-timeout=42 >>> [2021-03-11 19:15:30.613142] D >>> [socket.c:4543:socket_init] 0-socket.gfchangelog: >>> Reconfigured transport.keepalivecnt=9 >>> [2021-03-11 19:15:30.613208] I >>> [socket.c:929:__socket_server_bind] >>> 0-socket.gfchangelog: closing (AF_UNIX) reuse check >>> socket 22 >>> [2021-03-11 19:15:30.613545] D >>> [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: >>> New program registered: LIBGFCHANGELOG REBORP, Num: >>> 1886350951, Ver: 1, Port: 0 >>> [2021-03-11 19:15:30.613567] D >>> [rpc-clnt.c:1020:rpc_clnt_connection_init] >>> 0-gfchangelog: defaulting frame-timeout to 30mins >>> [2021-03-11 19:15:30.613574] D >>> [rpc-clnt.c:1032:rpc_clnt_connection_init] >>> 0-gfchangelog: disable ping-timeout >>> [2021-03-11 19:15:30.613582] D >>> [rpc-transport.c:278:rpc_transport_load] >>> 0-rpc-transport: attempt to load file >>> /usr/lib64/glusterfs/8.3/rpc-transport/socket.so >>> [2021-03-11 19:15:30.613637] D >>> [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay >>> [2021-03-11 19:15:30.613654] D >>> [socket.c:4523:socket_init] 0-gfchangelog: Configured >>> transport.tcp-user-timeout=42 >>> [2021-03-11 19:15:32.614273] D >>> [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: >>> ping timeout is 0, returning >>> [2021-03-11 19:15:32.643628] I [MSGID: 132035] >>> [gf-history-changelog.c:837:gf_history_changelog] >>> 0-gfchangelog: Requesting historical changelogs >>> [{start=1614666552}, {end=1615490132}] >>> [2021-03-11 19:15:32.643716] I [MSGID: 132019] >>> [gf-history-changelog.c:755:gf_changelog_extract_min_max] >>> 0-gfchangelog: changelogs min max [{min=1597342860}, >>> {max=1615490123}, {total_changelogs=1264296}] >>> [2021-03-11 19:15:32.700397] E [MSGID: 132009] >>> [gf-history-changelog.c:941:gf_history_changelog] >>> 0-gfchangelog: wrong result [{for=end}, >>> {start=1615490123}, {idx=1264295}] >>> [2021-03-11 19:15:46.832322] I [MSGID: 132028] >>> [gf-changelog.c:577:gf_changelog_register_generic] >>> 0-gfchangelog: Registering brick >>> [{brick=/data/storage_b/storage}, {notify_filter=1}] >>> [2021-03-11 19:15:46.832394] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=0}] >>> [2021-03-11 19:15:46.832465] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=1}] >>> [2021-03-11 19:15:46.832531] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=2}] >>> [2021-03-11 19:15:46.833086] I [MSGID: 101190] >>> [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: >>> Started thread with index [{index=3}] >>> [2021-03-11 19:15:46.833648] D >>> [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service >>> inited. >>> [2021-03-11 19:15:46.833817] D >>> [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: >>> New program registered: GF-DUMP, Num: 123451501, Ver: 1, >>> Port: 0 >>> [2021-03-11 19:15:46.833835] D >>> [rpc-transport.c:278:rpc_transport_load] >>> 0-rpc-transport: attempt to load file >>> /usr/lib64/glusterfs/8.3/rpc-transport/socket.so >>> [2021-03-11 19:15:46.834368] D >>> [socket.c:4485:socket_init] 0-socket.gfchangelog: >>> disabling nodelay >>> [2021-03-11 19:15:46.834380] D >>> [socket.c:4523:socket_init] 0-socket.gfchangelog: >>> Configured transport.tcp-user-timeout=42 >>> [2021-03-11 19:15:46.834386] D >>> [socket.c:4543:socket_init] 0-socket.gfchangelog: >>> Reconfigured transport.keepalivecnt=9 >>> [2021-03-11 19:15:46.834441] I >>> [socket.c:929:__socket_server_bind] >>> 0-socket.gfchangelog: closing (AF_UNIX) reuse check >>> socket 23 >>> [2021-03-11 19:15:46.834768] D >>> [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: >>> New program registered: LIBGFCHANGELOG REBORP, Num: >>> 1886350951, Ver: 1, Port: 0 >>> [2021-03-11 19:15:46.834789] D >>> [rpc-clnt.c:1020:rpc_clnt_connection_init] >>> 0-gfchangelog: defaulting frame-timeout to 30mins >>> [2021-03-11 19:15:46.834795] D >>> [rpc-clnt.c:1032:rpc_clnt_connection_init] >>> 0-gfchangelog: disable ping-timeout >>> [2021-03-11 19:15:46.834802] D >>> [rpc-transport.c:278:rpc_transport_load] >>> 0-rpc-transport: attempt to load file >>> /usr/lib64/glusterfs/8.3/rpc-transport/socket.so >>> [2021-03-11 19:15:46.834845] D >>> [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay >>> [2021-03-11 19:15:46.834853] D >>> [socket.c:4523:socket_init] 0-gfchangelog: Configured >>> transport.tcp-user-timeout=42 >>> [2021-03-11 19:15:47.618476] D >>> [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: >>> ping timeout is 0, returning >>> >>> >>> gsyncd logged a lot but I'm not sure if it's helpful: >>> >>> [2021-03-11 19:15:00.41898] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:21.551302] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:21.631470] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:21.718386] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:21.804991] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:26.203999] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:26.284775] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:26.573355] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:26.653752] D >>> [gsyncd(monitor):303:main] <top>: Using session config >>> file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:26.756994] D >>> [monitor(monitor):304:distribute] <top>: master bricks: >>> [{'host': '10.0.231.91', 'uuid': >>> 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': >>> '/data/storage_a/storage'}, {'host': '10.0.2 >>> 31.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', >>> 'dir': '/data/storage_b/storage'}, {'host': >>> '10.0.231.93', 'uuid': >>> '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': >>> '/data/storage_c/storage'}, {'host': '10. >>> 0.231.92', 'uuid': >>> 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': >>> '/data/storage_a/storage'}, {'host': '10.0.231.93', >>> 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': >>> '/data/storage_b/storage'}, {'host': ' >>> 10.0.231.91', 'uuid': >>> 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': >>> '/data/storage_c/storage'}, {'host': '10.0.231.93', >>> 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': >>> '/data/storage_a/storage'}, {'host' >>> : '10.0.231.91', 'uuid': >>> 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': >>> '/data/storage_b/storage'}, {'host': '10.0.231.92', >>> 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': >>> '/data/storage_c/storage'}] >>> [2021-03-11 19:15:26.757252] D >>> [monitor(monitor):314:distribute] <top>: slave SSH >>> gateway: geoaccount at 10.0.231.81 >>> <mailto:geoaccount at 10.0.231.81> >>> [2021-03-11 19:15:27.416235] D >>> [monitor(monitor):334:distribute] <top>: slave bricks: >>> [{'host': '10.0.231.81', 'uuid': >>> 'b88dea4f-31ec-416a-9110-3ccdc3910acd', 'dir': >>> '/data/brick'}, {'host': '10.0.231.82', 'uuid >>> ': 'be50a8de-3934-4fee-a80d-8e2e99017902', 'dir': >>> '/data/brick'}] >>> [2021-03-11 19:15:27.416825] D >>> [syncdutils(monitor):932:is_hot] Volinfo: brickpath: >>> '10.0.231.91:/data/storage_a/storage' >>> [2021-03-11 19:15:27.417273] D >>> [syncdutils(monitor):932:is_hot] Volinfo: brickpath: >>> '10.0.231.91:/data/storage_c/storage' >>> [2021-03-11 19:15:27.417515] D >>> [syncdutils(monitor):932:is_hot] Volinfo: brickpath: >>> '10.0.231.91:/data/storage_b/storage' >>> [2021-03-11 19:15:27.417763] D >>> [monitor(monitor):348:distribute] <top>: worker specs: >>> [({'host': '10.0.231.91', 'uuid': >>> 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': >>> '/data/storage_a/storage'}, ('geoaccount at 10. >>> 0.231.81', 'b88dea4f-31ec-416a-9110-3ccdc3910acd'), '1', >>> False), ({'host': '10.0.231.91', 'uuid': >>> 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': >>> '/data/storage_c/storage'}, ('geoaccount at 10.0.231.82 >>> <mailto:geoaccount at 10.0.231.82>', 'be50a8de-3 >>> 934-4fee-a80d-8e2e99017902'), '2', False), ({'host': >>> '10.0.231.91', 'uuid': >>> 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': >>> '/data/storage_b/storage'}, ('geoaccount at 10.0.231.82 >>> <mailto:geoaccount at 10.0.231.82>', >>> 'be50a8de-3934-4fee-a80d-8e2e9901 >>> 7902'), '3', False)] >>> [2021-03-11 19:15:27.425009] I >>> [monitor(monitor):160:monitor] Monitor: starting gsyncd >>> worker [{brick=/data/storage_c/storage}, >>> {slave_node=10.0.231.82}] >>> [2021-03-11 19:15:27.426764] I >>> [monitor(monitor):160:monitor] Monitor: starting gsyncd >>> worker [{brick=/data/storage_b/storage}, >>> {slave_node=10.0.231.82}] >>> [2021-03-11 19:15:27.429208] I >>> [monitor(monitor):160:monitor] Monitor: starting gsyncd >>> worker [{brick=/data/storage_a/storage}, >>> {slave_node=10.0.231.81}] >>> [2021-03-11 19:15:27.432280] D >>> [monitor(monitor):195:monitor] Monitor: Worker would >>> mount volume privately >>> [2021-03-11 19:15:27.434195] D >>> [monitor(monitor):195:monitor] Monitor: Worker would >>> mount volume privately >>> [2021-03-11 19:15:27.436584] D >>> [monitor(monitor):195:monitor] Monitor: Worker would >>> mount volume privately >>> [2021-03-11 19:15:27.478806] D [gsyncd(worker >>> /data/storage_c/storage):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:27.478852] D [gsyncd(worker >>> /data/storage_b/storage):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:27.480104] D [gsyncd(worker >>> /data/storage_a/storage):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:27.500456] I [resource(worker >>> /data/storage_c/storage):1387:connect_remote] SSH: >>> Initializing SSH connection between master and slave... >>> [2021-03-11 19:15:27.501375] I [resource(worker >>> /data/storage_b/storage):1387:connect_remote] SSH: >>> Initializing SSH connection between master and slave... >>> [2021-03-11 19:15:27.502003] I [resource(worker >>> /data/storage_a/storage):1387:connect_remote] SSH: >>> Initializing SSH connection between master and slave... >>> [2021-03-11 19:15:27.525511] D [repce(worker >>> /data/storage_a/storage):195:push] RepceClient: call >>> 192117:140572692309824:1615490127.53 __repce_version__() ... >>> [2021-03-11 19:15:27.525582] D [repce(worker >>> /data/storage_b/storage):195:push] RepceClient: call >>> 192115:139891296405312:1615490127.53 __repce_version__() ... >>> [2021-03-11 19:15:27.526089] D [repce(worker >>> /data/storage_c/storage):195:push] RepceClient: call >>> 192114:140388828780352:1615490127.53 __repce_version__() ... >>> [2021-03-11 19:15:29.435985] D [repce(worker >>> /data/storage_a/storage):215:__call__] RepceClient: call >>> 192117:140572692309824:1615490127.53 __repce_version__ >>> -> 1.0 >>> [2021-03-11 19:15:29.436213] D [repce(worker >>> /data/storage_a/storage):195:push] RepceClient: call >>> 192117:140572692309824:1615490129.44 version() ... >>> [2021-03-11 19:15:29.437136] D [repce(worker >>> /data/storage_a/storage):215:__call__] RepceClient: call >>> 192117:140572692309824:1615490129.44 version -> 1.0 >>> [2021-03-11 19:15:29.437268] D [repce(worker >>> /data/storage_a/storage):195:push] RepceClient: call >>> 192117:140572692309824:1615490129.44 pid() ... >>> [2021-03-11 19:15:29.437915] D [repce(worker >>> /data/storage_a/storage):215:__call__] RepceClient: call >>> 192117:140572692309824:1615490129.44 pid -> 157321 >>> [2021-03-11 19:15:29.438004] I [resource(worker >>> /data/storage_a/storage):1436:connect_remote] SSH: SSH >>> connection between master and slave established. >>> [{duration=1.9359}] >>> [2021-03-11 19:15:29.438072] I [resource(worker >>> /data/storage_a/storage):1116:connect] GLUSTER: Mounting >>> gluster volume locally... >>> [2021-03-11 19:15:29.494538] D [repce(worker >>> /data/storage_b/storage):215:__call__] RepceClient: call >>> 192115:139891296405312:1615490127.53 __repce_version__ >>> -> 1.0 >>> [2021-03-11 19:15:29.494748] D [repce(worker >>> /data/storage_b/storage):195:push] RepceClient: call >>> 192115:139891296405312:1615490129.49 version() ... >>> [2021-03-11 19:15:29.495290] D [repce(worker >>> /data/storage_b/storage):215:__call__] RepceClient: call >>> 192115:139891296405312:1615490129.49 version -> 1.0 >>> [2021-03-11 19:15:29.495400] D [repce(worker >>> /data/storage_b/storage):195:push] RepceClient: call >>> 192115:139891296405312:1615490129.5 pid() ... >>> [2021-03-11 19:15:29.495872] D [repce(worker >>> /data/storage_b/storage):215:__call__] RepceClient: call >>> 192115:139891296405312:1615490129.5 pid -> 88110 >>> [2021-03-11 19:15:29.495960] I [resource(worker >>> /data/storage_b/storage):1436:connect_remote] SSH: SSH >>> connection between master and slave established. >>> [{duration=1.9944}] >>> [2021-03-11 19:15:29.496028] I [resource(worker >>> /data/storage_b/storage):1116:connect] GLUSTER: Mounting >>> gluster volume locally... >>> [2021-03-11 19:15:29.501255] D [repce(worker >>> /data/storage_c/storage):215:__call__] RepceClient: call >>> 192114:140388828780352:1615490127.53 __repce_version__ >>> -> 1.0 >>> [2021-03-11 19:15:29.501454] D [repce(worker >>> /data/storage_c/storage):195:push] RepceClient: call >>> 192114:140388828780352:1615490129.5 version() ... >>> [2021-03-11 19:15:29.502258] D [repce(worker >>> /data/storage_c/storage):215:__call__] RepceClient: call >>> 192114:140388828780352:1615490129.5 version -> 1.0 >>> [2021-03-11 19:15:29.502444] D [repce(worker >>> /data/storage_c/storage):195:push] RepceClient: call >>> 192114:140388828780352:1615490129.5 pid() ... >>> [2021-03-11 19:15:29.503140] D [repce(worker >>> /data/storage_c/storage):215:__call__] RepceClient: call >>> 192114:140388828780352:1615490129.5 pid -> 88111 >>> [2021-03-11 19:15:29.503232] I [resource(worker >>> /data/storage_c/storage):1436:connect_remote] SSH: SSH >>> connection between master and slave established. >>> [{duration=2.0026}] >>> [2021-03-11 19:15:29.503302] I [resource(worker >>> /data/storage_c/storage):1116:connect] GLUSTER: Mounting >>> gluster volume locally... >>> [2021-03-11 19:15:29.533899] D [resource(worker >>> /data/storage_a/storage):880:inhibit] DirectMounter: >>> auxiliary glusterfs mount in place >>> [2021-03-11 19:15:29.595736] D [resource(worker >>> /data/storage_b/storage):880:inhibit] DirectMounter: >>> auxiliary glusterfs mount in place >>> [2021-03-11 19:15:29.601110] D [resource(worker >>> /data/storage_c/storage):880:inhibit] DirectMounter: >>> auxiliary glusterfs mount in place >>> [2021-03-11 19:15:30.541542] D [resource(worker >>> /data/storage_a/storage):964:inhibit] DirectMounter: >>> auxiliary glusterfs mount prepared >>> [2021-03-11 19:15:30.541816] I [resource(worker >>> /data/storage_a/storage):1139:connect] GLUSTER: Mounted >>> gluster volume [{duration=1.1037}] >>> [2021-03-11 19:15:30.541887] I [subcmds(worker >>> /data/storage_a/storage):84:subcmd_worker] <top>: Worker >>> spawn successful. Acknowledging back to monitor >>> [2021-03-11 19:15:30.542042] D [master(worker >>> /data/storage_a/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=xsync}] >>> [2021-03-11 19:15:30.542125] D >>> [monitor(monitor):222:monitor] Monitor: >>> worker(/data/storage_a/storage) connected >>> [2021-03-11 19:15:30.543323] D [master(worker >>> /data/storage_a/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changelog}] >>> [2021-03-11 19:15:30.544460] D [master(worker >>> /data/storage_a/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changeloghistory}] >>> [2021-03-11 19:15:30.552103] D [master(worker >>> /data/storage_a/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage >>> [2021-03-11 19:15:30.602937] D [resource(worker >>> /data/storage_b/storage):964:inhibit] DirectMounter: >>> auxiliary glusterfs mount prepared >>> [2021-03-11 19:15:30.603117] I [resource(worker >>> /data/storage_b/storage):1139:connect] GLUSTER: Mounted >>> gluster volume [{duration=1.1070}] >>> [2021-03-11 19:15:30.603197] I [subcmds(worker >>> /data/storage_b/storage):84:subcmd_worker] <top>: Worker >>> spawn successful. Acknowledging back to monitor >>> [2021-03-11 19:15:30.603353] D [master(worker >>> /data/storage_b/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=xsync}] >>> [2021-03-11 19:15:30.603338] D >>> [monitor(monitor):222:monitor] Monitor: >>> worker(/data/storage_b/storage) connected >>> [2021-03-11 19:15:30.604620] D [master(worker >>> /data/storage_b/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changelog}] >>> [2021-03-11 19:15:30.605600] D [master(worker >>> /data/storage_b/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changeloghistory}] >>> [2021-03-11 19:15:30.608365] D [resource(worker >>> /data/storage_c/storage):964:inhibit] DirectMounter: >>> auxiliary glusterfs mount prepared >>> [2021-03-11 19:15:30.608534] I [resource(worker >>> /data/storage_c/storage):1139:connect] GLUSTER: Mounted >>> gluster volume [{duration=1.1052}] >>> [2021-03-11 19:15:30.608612] I [subcmds(worker >>> /data/storage_c/storage):84:subcmd_worker] <top>: Worker >>> spawn successful. Acknowledging back to monitor >>> [2021-03-11 19:15:30.608762] D [master(worker >>> /data/storage_c/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=xsync}] >>> [2021-03-11 19:15:30.608779] D >>> [monitor(monitor):222:monitor] Monitor: >>> worker(/data/storage_c/storage) connected >>> [2021-03-11 19:15:30.610033] D [master(worker >>> /data/storage_c/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changelog}] >>> [2021-03-11 19:15:30.610637] D [master(worker >>> /data/storage_b/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage >>> [2021-03-11 19:15:30.610970] D [master(worker >>> /data/storage_c/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changeloghistory}] >>> [2021-03-11 19:15:30.616197] D [master(worker >>> /data/storage_c/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage >>> [2021-03-11 19:15:31.371265] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:31.451000] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:31.537257] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:31.623800] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:32.555840] D [master(worker >>> /data/storage_a/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage >>> [2021-03-11 19:15:32.556051] D [master(worker >>> /data/storage_a/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage >>> [2021-03-11 19:15:32.556122] D [master(worker >>> /data/storage_a/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage >>> [2021-03-11 19:15:32.556179] I [master(worker >>> /data/storage_a/storage):1645:register] _GMaster: >>> Working dir >>> [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}] >>> [2021-03-11 19:15:32.556359] I [resource(worker >>> /data/storage_a/storage):1292:service_loop] GLUSTER: >>> Register time [{time=1615490132}] >>> [2021-03-11 19:15:32.556823] D [repce(worker >>> /data/storage_a/storage):195:push] RepceClient: call >>> 192117:140570487928576:1615490132.56 keep_alive(None,) ... >>> [2021-03-11 19:15:32.558429] D [repce(worker >>> /data/storage_a/storage):215:__call__] RepceClient: call >>> 192117:140570487928576:1615490132.56 keep_alive -> 1 >>> [2021-03-11 19:15:32.558974] D [master(worker >>> /data/storage_a/storage):540:crawlwrap] _GMaster: >>> primary master with volume id >>> cf94a8f2-324b-40b3-bf72-c3766100ea99 ... >>> [2021-03-11 19:15:32.567478] I [gsyncdstatus(worker >>> /data/storage_a/storage):281:set_active] GeorepStatus: >>> Worker Status Change [{status=Active}] >>> [2021-03-11 19:15:32.571824] I [gsyncdstatus(worker >>> /data/storage_a/storage):253:set_worker_crawl_status] >>> GeorepStatus: Crawl Status Change [{status=History Crawl}] >>> [2021-03-11 19:15:32.572052] I [master(worker >>> /data/storage_a/storage):1559:crawl] _GMaster: starting >>> history crawl [{turns=1}, {stime=(1614666553, 0)}, >>> {entry_stime=(1614664115, 0)}, {etime=1615490132}] >>> [2021-03-11 19:15:32.614506] D [master(worker >>> /data/storage_b/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage >>> [2021-03-11 19:15:32.614701] D [master(worker >>> /data/storage_b/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage >>> [2021-03-11 19:15:32.614788] D [master(worker >>> /data/storage_b/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage >>> [2021-03-11 19:15:32.614845] I [master(worker >>> /data/storage_b/storage):1645:register] _GMaster: >>> Working dir >>> [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}] >>> [2021-03-11 19:15:32.615000] I [resource(worker >>> /data/storage_b/storage):1292:service_loop] GLUSTER: >>> Register time [{time=1615490132}] >>> [2021-03-11 19:15:32.615586] D [repce(worker >>> /data/storage_b/storage):195:push] RepceClient: call >>> 192115:139889215526656:1615490132.62 keep_alive(None,) ... >>> [2021-03-11 19:15:32.617373] D [repce(worker >>> /data/storage_b/storage):215:__call__] RepceClient: call >>> 192115:139889215526656:1615490132.62 keep_alive -> 1 >>> [2021-03-11 19:15:32.618144] D [master(worker >>> /data/storage_b/storage):540:crawlwrap] _GMaster: >>> primary master with volume id >>> cf94a8f2-324b-40b3-bf72-c3766100ea99 ... >>> [2021-03-11 19:15:32.619323] D [master(worker >>> /data/storage_c/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage >>> [2021-03-11 19:15:32.619491] D [master(worker >>> /data/storage_c/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage >>> [2021-03-11 19:15:32.619739] D [master(worker >>> /data/storage_c/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage >>> [2021-03-11 19:15:32.619863] I [master(worker >>> /data/storage_c/storage):1645:register] _GMaster: >>> Working dir >>> [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}] >>> [2021-03-11 19:15:32.620040] I [resource(worker >>> /data/storage_c/storage):1292:service_loop] GLUSTER: >>> Register time [{time=1615490132}] >>> [2021-03-11 19:15:32.620599] D [repce(worker >>> /data/storage_c/storage):195:push] RepceClient: call >>> 192114:140386886469376:1615490132.62 keep_alive(None,) ... >>> [2021-03-11 19:15:32.621397] E [resource(worker >>> /data/storage_a/storage):1312:service_loop] GLUSTER: >>> Changelog History Crawl failed [{error=[Errno 0] Success}] >>> [2021-03-11 19:15:32.622035] D [repce(worker >>> /data/storage_c/storage):215:__call__] RepceClient: call >>> 192114:140386886469376:1615490132.62 keep_alive -> 1 >>> [2021-03-11 19:15:32.622701] D [master(worker >>> /data/storage_c/storage):540:crawlwrap] _GMaster: >>> primary master with volume id >>> cf94a8f2-324b-40b3-bf72-c3766100ea99 ... >>> [2021-03-11 19:15:32.627031] I [gsyncdstatus(worker >>> /data/storage_b/storage):281:set_active] GeorepStatus: >>> Worker Status Change [{status=Active}] >>> [2021-03-11 19:15:32.643184] I [gsyncdstatus(worker >>> /data/storage_b/storage):253:set_worker_crawl_status] >>> GeorepStatus: Crawl Status Change [{status=History Crawl}] >>> [2021-03-11 19:15:32.643528] I [master(worker >>> /data/storage_b/storage):1559:crawl] _GMaster: starting >>> history crawl [{turns=1}, {stime=(1614666552, 0)}, >>> {entry_stime=(1614664113, 0)}, {etime=1615490132}] >>> [2021-03-11 19:15:32.645148] I [gsyncdstatus(worker >>> /data/storage_c/storage):281:set_active] GeorepStatus: >>> Worker Status Change [{status=Active}] >>> [2021-03-11 19:15:32.649631] I [gsyncdstatus(worker >>> /data/storage_c/storage):253:set_worker_crawl_status] >>> GeorepStatus: Crawl Status Change [{status=History Crawl}] >>> [2021-03-11 19:15:32.649882] I [master(worker >>> /data/storage_c/storage):1559:crawl] _GMaster: starting >>> history crawl [{turns=1}, {stime=(1614666552, 0)}, >>> {entry_stime=(1614664108, 0)}, {etime=1615490132}] >>> [2021-03-11 19:15:32.650907] E [resource(worker >>> /data/storage_c/storage):1312:service_loop] GLUSTER: >>> Changelog History Crawl failed [{error=[Errno 0] Success}] >>> [2021-03-11 19:15:32.700489] E [resource(worker >>> /data/storage_b/storage):1312:service_loop] GLUSTER: >>> Changelog History Crawl failed [{error=[Errno 0] Success}] >>> [2021-03-11 19:15:33.545886] I >>> [monitor(monitor):228:monitor] Monitor: worker died in >>> startup phase [{brick=/data/storage_a/storage}] >>> [2021-03-11 19:15:33.550487] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-11 19:15:33.606991] I >>> [monitor(monitor):228:monitor] Monitor: worker died in >>> startup phase [{brick=/data/storage_b/storage}] >>> [2021-03-11 19:15:33.611573] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-11 19:15:33.612337] I >>> [monitor(monitor):228:monitor] Monitor: worker died in >>> startup phase [{brick=/data/storage_c/storage}] >>> [2021-03-11 19:15:33.615777] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-11 19:15:34.684247] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:34.764971] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:34.851174] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:34.937166] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:36.994502] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:37.73805] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:37.159288] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:37.244153] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:38.916510] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:38.997649] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:39.84816] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:39.172045] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:40.896359] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:40.976135] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:41.62052] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:41.147902] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:42.791997] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:42.871239] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:42.956609] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:43.42473] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:43.566190] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change >>> [{status=Initializing...}] >>> [2021-03-11 19:15:43.566400] I >>> [monitor(monitor):160:monitor] Monitor: starting gsyncd >>> worker [{brick=/data/storage_a/storage}, >>> {slave_node=10.0.231.81}] >>> [2021-03-11 19:15:43.572240] D >>> [monitor(monitor):195:monitor] Monitor: Worker would >>> mount volume privately >>> [2021-03-11 19:15:43.612744] D [gsyncd(worker >>> /data/storage_a/storage):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:43.625689] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change >>> [{status=Initializing...}] >>> [2021-03-11 19:15:43.626060] I >>> [monitor(monitor):160:monitor] Monitor: starting gsyncd >>> worker [{brick=/data/storage_b/storage}, >>> {slave_node=10.0.231.82}] >>> [2021-03-11 19:15:43.632287] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change >>> [{status=Initializing...}] >>> [2021-03-11 19:15:43.632137] D >>> [monitor(monitor):195:monitor] Monitor: Worker would >>> mount volume privately >>> [2021-03-11 19:15:43.632508] I >>> [monitor(monitor):160:monitor] Monitor: starting gsyncd >>> worker [{brick=/data/storage_c/storage}, >>> {slave_node=10.0.231.82}] >>> [2021-03-11 19:15:43.635565] I [resource(worker >>> /data/storage_a/storage):1387:connect_remote] SSH: >>> Initializing SSH connection between master and slave... >>> [2021-03-11 19:15:43.637835] D >>> [monitor(monitor):195:monitor] Monitor: Worker would >>> mount volume privately >>> [2021-03-11 19:15:43.661304] D [repce(worker >>> /data/storage_a/storage):195:push] RepceClient: call >>> 192535:140367272073024:1615490143.66 __repce_version__() ... >>> [2021-03-11 19:15:43.674499] D [gsyncd(worker >>> /data/storage_b/storage):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:43.680706] D [gsyncd(worker >>> /data/storage_c/storage):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:43.693773] I [resource(worker >>> /data/storage_b/storage):1387:connect_remote] SSH: >>> Initializing SSH connection between master and slave... >>> [2021-03-11 19:15:43.700957] I [resource(worker >>> /data/storage_c/storage):1387:connect_remote] SSH: >>> Initializing SSH connection between master and slave... >>> [2021-03-11 19:15:43.717686] D [repce(worker >>> /data/storage_b/storage):195:push] RepceClient: call >>> 192539:139907321804608:1615490143.72 __repce_version__() ... >>> [2021-03-11 19:15:43.725369] D [repce(worker >>> /data/storage_c/storage):195:push] RepceClient: call >>> 192541:140653101852480:1615490143.73 __repce_version__() ... >>> [2021-03-11 19:15:44.289117] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:44.375693] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:44.472251] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:44.558429] D [gsyncd(status):303:main] >>> <top>: Using session config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:45.619694] D [repce(worker >>> /data/storage_a/storage):215:__call__] RepceClient: call >>> 192535:140367272073024:1615490143.66 __repce_version__ >>> -> 1.0 >>> [2021-03-11 19:15:45.619930] D [repce(worker >>> /data/storage_a/storage):195:push] RepceClient: call >>> 192535:140367272073024:1615490145.62 version() ... >>> [2021-03-11 19:15:45.621191] D [repce(worker >>> /data/storage_a/storage):215:__call__] RepceClient: call >>> 192535:140367272073024:1615490145.62 version -> 1.0 >>> [2021-03-11 19:15:45.621332] D [repce(worker >>> /data/storage_a/storage):195:push] RepceClient: call >>> 192535:140367272073024:1615490145.62 pid() ... >>> [2021-03-11 19:15:45.621859] D [repce(worker >>> /data/storage_a/storage):215:__call__] RepceClient: call >>> 192535:140367272073024:1615490145.62 pid -> 158229 >>> [2021-03-11 19:15:45.621939] I [resource(worker >>> /data/storage_a/storage):1436:connect_remote] SSH: SSH >>> connection between master and slave established. >>> [{duration=1.9862}] >>> [2021-03-11 19:15:45.622000] I [resource(worker >>> /data/storage_a/storage):1116:connect] GLUSTER: Mounting >>> gluster volume locally... >>> [2021-03-11 19:15:45.714468] D [resource(worker >>> /data/storage_a/storage):880:inhibit] DirectMounter: >>> auxiliary glusterfs mount in place >>> [2021-03-11 19:15:45.718441] D [repce(worker >>> /data/storage_c/storage):215:__call__] RepceClient: call >>> 192541:140653101852480:1615490143.73 __repce_version__ >>> -> 1.0 >>> [2021-03-11 19:15:45.718643] D [repce(worker >>> /data/storage_c/storage):195:push] RepceClient: call >>> 192541:140653101852480:1615490145.72 version() ... >>> [2021-03-11 19:15:45.719492] D [repce(worker >>> /data/storage_c/storage):215:__call__] RepceClient: call >>> 192541:140653101852480:1615490145.72 version -> 1.0 >>> [2021-03-11 19:15:45.719772] D [repce(worker >>> /data/storage_c/storage):195:push] RepceClient: call >>> 192541:140653101852480:1615490145.72 pid() ... >>> [2021-03-11 19:15:45.720202] D [repce(worker >>> /data/storage_b/storage):215:__call__] RepceClient: call >>> 192539:139907321804608:1615490143.72 __repce_version__ >>> -> 1.0 >>> [2021-03-11 19:15:45.720381] D [repce(worker >>> /data/storage_b/storage):195:push] RepceClient: call >>> 192539:139907321804608:1615490145.72 version() ... >>> [2021-03-11 19:15:45.720463] D [repce(worker >>> /data/storage_c/storage):215:__call__] RepceClient: call >>> 192541:140653101852480:1615490145.72 pid -> 88921 >>> [2021-03-11 19:15:45.720694] I [resource(worker >>> /data/storage_c/storage):1436:connect_remote] SSH: SSH >>> connection between master and slave established. >>> [{duration=2.0196}] >>> [2021-03-11 19:15:45.720882] I [resource(worker >>> /data/storage_c/storage):1116:connect] GLUSTER: Mounting >>> gluster volume locally... >>> [2021-03-11 19:15:45.721146] D [repce(worker >>> /data/storage_b/storage):215:__call__] RepceClient: call >>> 192539:139907321804608:1615490145.72 version -> 1.0 >>> [2021-03-11 19:15:45.721271] D [repce(worker >>> /data/storage_b/storage):195:push] RepceClient: call >>> 192539:139907321804608:1615490145.72 pid() ... >>> [2021-03-11 19:15:45.721795] D [repce(worker >>> /data/storage_b/storage):215:__call__] RepceClient: call >>> 192539:139907321804608:1615490145.72 pid -> 88924 >>> [2021-03-11 19:15:45.721911] I [resource(worker >>> /data/storage_b/storage):1436:connect_remote] SSH: SSH >>> connection between master and slave established. >>> [{duration=2.0280}] >>> [2021-03-11 19:15:45.721993] I [resource(worker >>> /data/storage_b/storage):1116:connect] GLUSTER: Mounting >>> gluster volume locally... >>> [2021-03-11 19:15:45.816891] D [resource(worker >>> /data/storage_b/storage):880:inhibit] DirectMounter: >>> auxiliary glusterfs mount in place >>> [2021-03-11 19:15:45.816960] D [resource(worker >>> /data/storage_c/storage):880:inhibit] DirectMounter: >>> auxiliary glusterfs mount in place >>> [2021-03-11 19:15:46.721534] D [resource(worker >>> /data/storage_a/storage):964:inhibit] DirectMounter: >>> auxiliary glusterfs mount prepared >>> [2021-03-11 19:15:46.721726] I [resource(worker >>> /data/storage_a/storage):1139:connect] GLUSTER: Mounted >>> gluster volume [{duration=1.0997}] >>> [2021-03-11 19:15:46.721796] I [subcmds(worker >>> /data/storage_a/storage):84:subcmd_worker] <top>: Worker >>> spawn successful. Acknowledging back to monitor >>> [2021-03-11 19:15:46.721971] D [master(worker >>> /data/storage_a/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=xsync}] >>> [2021-03-11 19:15:46.722122] D >>> [monitor(monitor):222:monitor] Monitor: >>> worker(/data/storage_a/storage) connected >>> [2021-03-11 19:15:46.723871] D [master(worker >>> /data/storage_a/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changelog}] >>> [2021-03-11 19:15:46.725100] D [master(worker >>> /data/storage_a/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changeloghistory}] >>> [2021-03-11 19:15:46.732400] D [master(worker >>> /data/storage_a/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage >>> [2021-03-11 19:15:46.823477] D [resource(worker >>> /data/storage_c/storage):964:inhibit] DirectMounter: >>> auxiliary glusterfs mount prepared >>> [2021-03-11 19:15:46.823645] I [resource(worker >>> /data/storage_c/storage):1139:connect] GLUSTER: Mounted >>> gluster volume [{duration=1.1027}] >>> [2021-03-11 19:15:46.823754] I [subcmds(worker >>> /data/storage_c/storage):84:subcmd_worker] <top>: Worker >>> spawn successful. Acknowledging back to monitor >>> [2021-03-11 19:15:46.823932] D [master(worker >>> /data/storage_c/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=xsync}] >>> [2021-03-11 19:15:46.823904] D [resource(worker >>> /data/storage_b/storage):964:inhibit] DirectMounter: >>> auxiliary glusterfs mount prepared >>> [2021-03-11 19:15:46.823930] D >>> [monitor(monitor):222:monitor] Monitor: >>> worker(/data/storage_c/storage) connected >>> [2021-03-11 19:15:46.824103] I [resource(worker >>> /data/storage_b/storage):1139:connect] GLUSTER: Mounted >>> gluster volume [{duration=1.1020}] >>> [2021-03-11 19:15:46.824184] I [subcmds(worker >>> /data/storage_b/storage):84:subcmd_worker] <top>: Worker >>> spawn successful. Acknowledging back to monitor >>> [2021-03-11 19:15:46.824340] D [master(worker >>> /data/storage_b/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=xsync}] >>> [2021-03-11 19:15:46.824321] D >>> [monitor(monitor):222:monitor] Monitor: >>> worker(/data/storage_b/storage) connected >>> [2021-03-11 19:15:46.825100] D [master(worker >>> /data/storage_c/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changelog}] >>> [2021-03-11 19:15:46.825414] D [master(worker >>> /data/storage_b/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changelog}] >>> [2021-03-11 19:15:46.826375] D [master(worker >>> /data/storage_b/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changeloghistory}] >>> [2021-03-11 19:15:46.826574] D [master(worker >>> /data/storage_c/storage):105:gmaster_builder] <top>: >>> setting up change detection mode [{mode=changeloghistory}] >>> [2021-03-11 19:15:46.831506] D [master(worker >>> /data/storage_b/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage >>> [2021-03-11 19:15:46.833168] D [master(worker >>> /data/storage_c/storage):778:setup_working_dir] >>> _GMaster: changelog working dir >>> /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage >>> [2021-03-11 19:15:47.275141] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:47.320247] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:47.570877] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:47.615571] D >>> [gsyncd(config-get):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:47.620893] E [syncdutils(worker >>> /data/storage_a/storage):325:log_raise_exception] <top>: >>> connection to peer is broken >>> [2021-03-11 19:15:47.620939] E [syncdutils(worker >>> /data/storage_c/storage):325:log_raise_exception] <top>: >>> connection to peer is broken >>> [2021-03-11 19:15:47.621668] E [syncdutils(worker >>> /data/storage_a/storage):847:errlog] Popen: command >>> returned error [{cmd=ssh -oPasswordAuthentication=no >>> -oStrictHostKeyChecking=no -i >>> /var/lib/glusterd/geo-replication/secret.pem -p 22 >>> -oControlMaster=auto -S >>> /tmp/gsyncd-aux-ssh-_AyCOc/79fa3dc75e30f532b4a40bc08c2b10a1.sock >>> geoaccount at 10.0.231.81 <mailto:geoaccount at 10.0.231.81> >>> /nonexistent/gsyncd slave storage >>> geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> >>> --master-node 10.0.231.91 --master-node-id >>> afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick >>> /data/storage_a/storage --local-node 10.0.231.81 >>> --local-node-id b88dea4f-31ec-416a-9110-3ccdc3910acd >>> --slave-timeout 120 --slave-log-level INFO >>> --slave-gluster-log-level INFO >>> --slave-gluster-command-dir /usr/sbin >>> --master-dist-count 3}, {error=255}] >>> [2021-03-11 19:15:47.621685] E [syncdutils(worker >>> /data/storage_c/storage):847:errlog] Popen: command >>> returned error [{cmd=ssh -oPasswordAuthentication=no >>> -oStrictHostKeyChecking=no -i >>> /var/lib/glusterd/geo-replication/secret.pem -p 22 >>> -oControlMaster=auto -S >>> /tmp/gsyncd-aux-ssh-WOgOEu/e15fc58bb13552de0710eaf018209548.sock >>> geoaccount at 10.0.231.82 <mailto:geoaccount at 10.0.231.82> >>> /nonexistent/gsyncd slave storage >>> geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> >>> --master-node 10.0.231.91 --master-node-id >>> afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick >>> /data/storage_c/storage --local-node 10.0.231.82 >>> --local-node-id be50a8de-3934-4fee-a80d-8e2e99017902 >>> --slave-timeout 120 --slave-log-level INFO >>> --slave-gluster-log-level INFO >>> --slave-gluster-command-dir /usr/sbin >>> --master-dist-count 3}, {error=255}] >>> [2021-03-11 19:15:47.621776] E [syncdutils(worker >>> /data/storage_a/storage):851:logerr] Popen: ssh> Killed >>> by signal 15. >>> [2021-03-11 19:15:47.621819] E [syncdutils(worker >>> /data/storage_c/storage):851:logerr] Popen: ssh> Killed >>> by signal 15. >>> [2021-03-11 19:15:47.621850] E [syncdutils(worker >>> /data/storage_b/storage):325:log_raise_exception] <top>: >>> connection to peer is broken >>> [2021-03-11 19:15:47.622437] E [syncdutils(worker >>> /data/storage_b/storage):847:errlog] Popen: command >>> returned error [{cmd=ssh -oPasswordAuthentication=no >>> -oStrictHostKeyChecking=no -i >>> /var/lib/glusterd/geo-replication/secret.pem -p 22 >>> -oControlMaster=auto -S >>> /tmp/gsyncd-aux-ssh-Vy935W/e15fc58bb13552de0710eaf018209548.sock >>> geoaccount at 10.0.231.82 <mailto:geoaccount at 10.0.231.82> >>> /nonexistent/gsyncd slave storage >>> geoaccount at 10.0.231.81::pcic-backup >>> <mailto:geoaccount at 10.0.231.81::pcic-backup> >>> --master-node 10.0.231.91 --master-node-id >>> afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick >>> /data/storage_b/storage --local-node 10.0.231.82 >>> --local-node-id be50a8de-3934-4fee-a80d-8e2e99017902 >>> --slave-timeout 120 --slave-log-level INFO >>> --slave-gluster-log-level INFO >>> --slave-gluster-command-dir /usr/sbin >>> --master-dist-count 3}, {error=255}] >>> [2021-03-11 19:15:47.622556] E [syncdutils(worker >>> /data/storage_b/storage):851:logerr] Popen: ssh> Killed >>> by signal 15. >>> [2021-03-11 19:15:47.723756] I >>> [monitor(monitor):228:monitor] Monitor: worker died in >>> startup phase [{brick=/data/storage_a/storage}] >>> [2021-03-11 19:15:47.731405] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-11 19:15:47.825223] I >>> [monitor(monitor):228:monitor] Monitor: worker died in >>> startup phase [{brick=/data/storage_c/storage}] >>> [2021-03-11 19:15:47.825685] I >>> [monitor(monitor):228:monitor] Monitor: worker died in >>> startup phase [{brick=/data/storage_b/storage}] >>> [2021-03-11 19:15:47.829011] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-11 19:15:47.830965] I >>> [gsyncdstatus(monitor):248:set_worker_status] >>> GeorepStatus: Worker Status Change [{status=Faulty}] >>> [2021-03-11 19:15:48.669634] D >>> [gsyncd(monitor-status):303:main] <top>: Using session >>> config file >>> [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}] >>> [2021-03-11 19:15:48.683784] I >>> [subcmds(monitor-status):29:subcmd_monitor_status] >>> <top>: Monitor Status Change [{status=Stopped}] >>> >>> >>> Thanks, >>> ?-Matthew >>> >>> >>> On 3/11/21 9:37 AM, Strahil Nikolov wrote: >>>> Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information. >>>> >>>> >>>> I think you have to increase the debug logs for geo-rep session. >>>> I will try to find the command necessary to increase it. >>>> >>>> >>>> Best Regards, >>>> Strahil Nikolov >>>> >>>> >>>> >>>> >>>> >>>> >>>> ? ?????????, 11 ???? 2021 ?., 00:38:41 ?. ???????+2, Matthew Benstead <matthewb at uvic.ca> <mailto:matthewb at uvic.ca> ??????: >>>> >>>> >>>> >>>> >>>> >>>> Thanks Strahil, >>>> >>>> Right - I had come across your message in early January that v8 from the CentOS Sig was missing the SELinux rules, and had put SELinux into permissive mode after the upgrade when I saw denied messages in the audit logs. >>>> >>>> [root at storage01 ~]# sestatus | egrep "^SELinux status|[mM]ode" >>>> SELinux status: enabled >>>> Current mode: permissive >>>> Mode from config file: enforcing >>>> >>>> Yes - I am using an unprivileged user for georep: >>>> >>>> [root at pcic-backup01 ~]# gluster-mountbroker status >>>> +-------------+-------------+---------------------------+--------------+--------------------------+ >>>> |???? NODE | NODE STATUS |???????? MOUNT ROOT |??? GROUP |????????? USERS | >>>> +-------------+-------------+---------------------------+--------------+--------------------------+ >>>> | 10.0.231.82 |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup) | >>>> |? localhost |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup) | >>>> +-------------+-------------+---------------------------+--------------+--------------------------+ >>>> >>>> [root at pcic-backup02 ~]# gluster-mountbroker status >>>> +-------------+-------------+---------------------------+--------------+--------------------------+ >>>> |???? NODE | NODE STATUS |???????? MOUNT ROOT |??? GROUP |????????? USERS | >>>> +-------------+-------------+---------------------------+--------------+--------------------------+ >>>> | 10.0.231.81 |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup) | >>>> |? localhost |????????? UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup) | >>>> +-------------+-------------+---------------------------+--------------+--------------------------+ >>>> >>>> Thanks, >>>> -Matthew >>>> >>>> >>>> -- >>>> Matthew Benstead >>>> System AdministratorPacific Climate Impacts ConsortiumUniversity of Victoria, UH1PO Box 1800, STN CSCVictoria, BC, V8W 2Y2Phone: +1-250-721-8432Email: matthewb at uvic.ca <mailto:matthewb at uvic.ca> >>>> >>>> >>>> On 3/10/21 2:11 PM, Strahil Nikolov wrote: >>>> >>>> >>>>> ?? >>>>> ??Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information. >>>>> >>>>> >>>>> I have tested georep on v8.3 and it was running quite well untill you involve SELINUX. >>>>> >>>>> >>>>> >>>>> Are you using SELINUX ? >>>>> >>>>> Are you using unprivileged user for the georep ? >>>>> >>>>> >>>>> >>>>> >>>>> Also, you can check https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-troubleshooting_geo-replication <https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-troubleshooting_geo-replication> . >>>>> >>>>> >>>>> >>>>> >>>>> Best Regards, >>>>> >>>>> Strahil Nikolov >>>>> >>>>> >>>>>> ?? >>>>>> ?? >>>>>> On Thu, Mar 11, 2021 at 0:03, Matthew Benstead >>>>>> >>>>>> <matthewb at uvic.ca> <mailto:matthewb at uvic.ca> wrote: >>>>>> >>>>>> >>>>>> ?? >>>>>> ?? >>>>>> Hello, >>>>>> >>>>>> I recently upgraded my Distributed-Replicate cluster from Gluster 7.9 to 8.3 on CentOS7 using the CentOS Storage SIG packages. I had geo-replication syncing properly before the upgrade, but not it is not working after. >>>>>> >>>>>> After I had upgraded both master and slave clusters I attempted to start geo-replication again, but it goes to faulty quickly: >>>>>> >>>>>> [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup <mailto:geoaccount at 10.0.231.81::pcic-backup> start >>>>>> Starting geo-replication session between storage &??geoaccount at 10.0.231.81::pcic-backup <mailto:geoaccount at 10.0.231.81::pcic-backup> has been successful\ >>>>>> >>>>>> [root at storage01 ~]# gluster volume geo-replication status >>>>>> >>>>>> MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED >>>>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------- >>>>>> 10.0.231.91 storage /data/storage_a/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> 10.0.231.91 storage /data/storage_c/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> 10.0.231.91 storage /data/storage_b/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> 10.0.231.92 storage /data/storage_b/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> 10.0.231.92 storage /data/storage_a/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> 10.0.231.92 storage /data/storage_c/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> 10.0.231.93 storage /data/storage_c/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> 10.0.231.93 storage /data/storage_b/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> 10.0.231.93 storage /data/storage_a/storage geoaccount ssh://geoaccount at 10.0.231.81::pcic-backup <mailto:ssh://geoaccount at 10.0.231.81::pcic-backup> N/A Faulty N/A N/A >>>>>> >>>>>> [root at storage01 ~]# gluster volume geo-replication storage geoaccount at 10.0.231.81::pcic-backup <mailto:geoaccount at 10.0.231.81::pcic-backup> stop >>>>>> Stopping geo-replication session between storage &??geoaccount at 10.0.231.81::pcic-backup <mailto:geoaccount at 10.0.231.81::pcic-backup> has been successful >>>>>> >>>>>> >>>>>> I went through the gsyncd logs and see it attempts to go back through the changlogs - which would make sense - but fails: >>>>>> >>>>>> [2021-03-10 19:18:42.165807] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] >>>>>> [2021-03-10 19:18:42.166136] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] >>>>>> [2021-03-10 19:18:42.167829] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] >>>>>> [2021-03-10 19:18:42.172343] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] >>>>>> [2021-03-10 19:18:42.172580] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] >>>>>> [2021-03-10 19:18:42.235574] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>>>>> [2021-03-10 19:18:42.236613] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>>>>> [2021-03-10 19:18:42.238614] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>>>>> [2021-03-10 19:18:44.144856] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9059}] >>>>>> [2021-03-10 19:18:44.145065] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally... >>>>>> [2021-03-10 19:18:44.162873] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9259}] >>>>>> [2021-03-10 19:18:44.163412] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally... >>>>>> [2021-03-10 19:18:44.167506] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9316}] >>>>>> [2021-03-10 19:18:44.167746] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally... >>>>>> [2021-03-10 19:18:45.251372] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1062}] >>>>>> [2021-03-10 19:18:45.251583] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor >>>>>> [2021-03-10 19:18:45.271950] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1041}] >>>>>> [2021-03-10 19:18:45.272118] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor >>>>>> [2021-03-10 19:18:45.275180] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1116}] >>>>>> [2021-03-10 19:18:45.275361] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor >>>>>> [2021-03-10 19:18:47.265618] I [master(worker /data/storage_b/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}] >>>>>> [2021-03-10 19:18:47.265954] I [resource(worker /data/storage_b/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] >>>>>> [2021-03-10 19:18:47.276746] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] >>>>>> [2021-03-10 19:18:47.281194] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] >>>>>> [2021-03-10 19:18:47.281404] I [master(worker /data/storage_b/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664113, 0)}, {etime=1615403927}] >>>>>> [2021-03-10 19:18:47.285340] I [master(worker /data/storage_c/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}] >>>>>> [2021-03-10 19:18:47.285579] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] >>>>>> [2021-03-10 19:18:47.287383] I [master(worker /data/storage_a/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}] >>>>>> [2021-03-10 19:18:47.287697] I [resource(worker /data/storage_a/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}] >>>>>> [2021-03-10 19:18:47.298415] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] >>>>>> [2021-03-10 19:18:47.301342] I [gsyncdstatus(worker /data/storage_a/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}] >>>>>> [2021-03-10 19:18:47.304183] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] >>>>>> [2021-03-10 19:18:47.304418] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615403927}] >>>>>> [2021-03-10 19:18:47.305294] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] >>>>>> [2021-03-10 19:18:47.308124] I [gsyncdstatus(worker /data/storage_a/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}] >>>>>> [2021-03-10 19:18:47.308509] I [master(worker /data/storage_a/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666553, 0)}, {entry_stime=(1614664115, 0)}, {etime=1615403927}] >>>>>> [2021-03-10 19:18:47.357470] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] >>>>>> [2021-03-10 19:18:47.383949] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}] >>>>>> [2021-03-10 19:18:48.255340] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}] >>>>>> [2021-03-10 19:18:48.260052] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] >>>>>> [2021-03-10 19:18:48.275651] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}] >>>>>> [2021-03-10 19:18:48.278064] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}] >>>>>> [2021-03-10 19:18:48.280453] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] >>>>>> [2021-03-10 19:18:48.282274] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}] >>>>>> [2021-03-10 19:18:58.275702] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] >>>>>> [2021-03-10 19:18:58.276041] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}] >>>>>> [2021-03-10 19:18:58.296252] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] >>>>>> [2021-03-10 19:18:58.296506] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}] >>>>>> [2021-03-10 19:18:58.301290] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}] >>>>>> [2021-03-10 19:18:58.301521] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}] >>>>>> [2021-03-10 19:18:58.345817] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>>>>> [2021-03-10 19:18:58.361268] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>>>>> [2021-03-10 19:18:58.367985] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave... >>>>>> [2021-03-10 19:18:59.115143] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change [{status=Stopped}] >>>>>> >>>>>> It seems like there is an issue selecting the changelogs - perhaps similar to this issue? https://github.com/gluster/glusterfs/issues/1766 <https://github.com/gluster/glusterfs/issues/1766> >>>>>> >>>>>> [root at storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_a-storage.log >>>>>> [2021-03-10 19:18:45.284764] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}] >>>>>> [2021-03-10 19:18:45.285275] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] >>>>>> [2021-03-10 19:18:45.285269] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] >>>>>> [2021-03-10 19:18:45.286615] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 21 >>>>>> [2021-03-10 19:18:47.308607] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615403927}] >>>>>> [2021-03-10 19:18:47.308659] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1250878}] >>>>>> [2021-03-10 19:18:47.383774] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1250877}] >>>>>> >>>>>> [root at storage01 storage_10.0.231.81_pcic-backup]# tail -7 changes-data-storage_b-storage.log >>>>>> [2021-03-10 19:18:45.263211] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}] >>>>>> [2021-03-10 19:18:45.263151] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}] >>>>>> [2021-03-10 19:18:45.263294] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}] >>>>>> [2021-03-10 19:18:45.264598] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23 >>>>>> [2021-03-10 19:18:47.281499] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615403927}] >>>>>> [2021-03-10 19:18:47.281551] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1258258}] >>>>>> [2021-03-10 19:18:47.357244] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for=end}, {start=1615403927}, {idx=1258257}] >>>>>> >>>>>> Any ideas on where to debug this? I'd prefer not to have to remove and re-sync everything as there is about 240TB on the cluster... >>>>>> >>>>>> Thanks, >>>>>> -Matthew >>>>>> >>>>>> >>>>>> ________ >>>>>> >>>>>> >>>>>> >>>>>> Community Meeting Calendar: >>>>>> >>>>>> Schedule - >>>>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC >>>>>> Bridge: https://meet.google.com/cpu-eiue-hvk <https://meet.google.com/cpu-eiue-hvk> >>>>>> Gluster-users mailing list >>>>>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org> >>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users <https://lists.gluster.org/mailman/listinfo/gluster-users> >>>>>> >>>>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20210317/ce6bc65e/attachment.html>