It seems the problem is not in you but in a deprecated Python package. The following link gave me a hint about where to look: https://issues.apache.org/jira/browse/SVN-4899

After replacing 'readfp' with 'read_file' on both nodes:
/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py
/usr/libexec/glusterfs/python/syncdaemon/gsyncdconfig.py

the push-pem (in my test case: gluster volume geo-replication sourcevol geoaccount at glusterdest::destvol create ssh-port 2244 push-pem force) succeeded, and then the rest of the commands in https://docs.redhat.com/en/documentation/red_hat_gluster_storage/3.5/html/administration_guide/sect-Preparing_to_Deploy_Geo-replication#Setting_Up_the_Environment_for_a_Secure_Geo-replication_Slave worked without any more issues and the georep was established:

root at glustersource:/mnt# gluster volume geo-replication sourcevol geoaccount at glusterdest::destvol status

PRIMARY NODE     PRIMARY VOL    PRIMARY BRICK          SECONDARY USER    SECONDARY                          SECONDARY NODE    STATUS    CRAWL STATUS       LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
glustersource    sourcevol      /bricks/vol1/brick1    geoaccount        geoaccount at glusterdest::destvol                      Active    Changelog Crawl    2024-08-29 01:15:36

root at glustersource:/mnt# ls -l /mnt | wc -l
151
root at glusterdest:~# ls -l /mnt | wc -l
151

Best Regards,
Strahil Nikolov


On Friday, August 23, 2024 at 14:58:50 GMT+3, Karl Kleinpaste <karl at kleinpaste.org> wrote:

On 8/22/24 21:54, Gilberto Ferreira wrote:
> Perhaps you can use these tools: https://aravindavk.in/blog/gluster-georep-tools/
> I am using it with great success.

I'll try anything once. In fact, I tried twice. It can't set up.

First, the code has port 22 baked in. I mentioned earlier that my sshd listens on a nonstandard port. Looking at the code (gluster_georep_tools/setup/cli.py, glustercli/cli/georep.py), there is provision within the functions for a different port, but there is no way that I can see to use it from the command-line args. Oddly shortsighted, or maybe I'm missing something.

So second, I tweaked the code to use my nonstandard port as its default, replacing all "22" with mine, and tried again... nope.

Geo-replication session will be established between j and geoacct at pms::n
root at pms password is required to complete the setup. NOTE: Password will not be stored.

root at pms's password:
[    OK] pms is Reachable (Port 6247)
Traceback (most recent call last):
  File "/usr/local/bin/gluster-georep-setup", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/site-packages/gluster_georep_tools/setup/cli.py", line 524, in main
    setup_georep()
  File "/usr/local/lib/python3.12/site-packages/gluster_georep_tools/setup/cli.py", line 461, in setup_georep
    ssh = ssh_initialize(secondary_host, args.secondary_user, passwd)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/gluster_georep_tools/setup/cli.py", line 212, in ssh_initialize
    ssh.connect(secondary_host, username=username, password=passwd)
  File "/usr/lib/python3.12/site-packages/paramiko/client.py", line 409, in connect
    raise NoValidConnectionsError(errors)
paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 172.17.4.3

It seems that either paramiko is still using 22 under the hood somewhere (cf. the last line of the error), or the "22" is misleading and it is actually insisting on password authentication; but my (otherwise pretty standard) sshd configuration requires keys-only root login, no password, which is already in place among all these machines.

I'm open to other suggestions.
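As an aside, paramiko's SSHClient.connect() does accept a port argument and can lean on the existing keys instead of a password, so in principle the tool only needs to pass the configured port through to that call. A rough sketch of what I mean, with the host and port values taken from my setup purely as placeholders:

# Rough sketch only: connect with paramiko on a nonstandard sshd port using
# the existing key setup. Host and port below are placeholders from my setup.
import paramiko

ssh = paramiko.SSHClient()
ssh.load_system_host_keys()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

# port= is an ordinary parameter of SSHClient.connect(); leaving it out is
# what silently falls back to 22. look_for_keys/allow_agent let key-based
# root login work without prompting for a password.
ssh.connect(
    "172.17.4.3",    # secondary host (placeholder)
    port=6247,       # nonstandard sshd port (placeholder)
    username="root",
    look_for_keys=True,
    allow_agent=True,
)

_, stdout, _ = ssh.exec_command("gluster --version")
print(stdout.read().decode())
ssh.close()

If gluster-georep-setup forwarded an ssh-port value into that call, presumably no source tweaking would be needed at all.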
On 8/28/24 18:20, Strahil Nikolov wrote:
> It seems the problem is not in you but in a deprecated Python package.
> After replacing 'readfp' with 'read_file' on both nodes:
> /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py
> /usr/libexec/glusterfs/python/syncdaemon/gsyncdconfig.py

Thank you so very much. I will experiment in the morning.
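For context on why this works: readfp() had long been deprecated in Python's configparser and was removed outright in Python 3.12, which is what gsyncd trips over; read_file() is the drop-in replacement. A minimal standalone sketch, with 'example.conf' standing in as a hypothetical config file:

# Sketch of the configparser change behind the gsyncd fix.
# 'example.conf' is a hypothetical file used only for illustration.
import configparser

cnf = configparser.ConfigParser()
with open("example.conf") as f:
    # cnf.readfp(f)    # deprecated since Python 3.2, removed in Python 3.12
    cnf.read_file(f)   # drop-in replacement, same parsing behaviour

print(cnf.sections())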
On 8/28/24 18:20, Strahil Nikolov wrote:
> It seems the problem is not in you but in a deprecated Python package.

I appear to be very close, but I can't quite get to the finish line.

I updated /usr/libexec/glusterfs/python/syncdaemon/gsyncdconfig.py on both systems, to replace readfp with read_file; you also mentioned /usr/libexec/glusterfs/python/syncdaemon/gsyncd.py, but that does not contain any instances of readfp.

diff -U0 gsyncdconfig.py.~1~ gsyncdconfig.py
--- gsyncdconfig.py.~1~    2023-11-05 19:00:00.000000000 -0500
+++ gsyncdconfig.py    2024-08-29 16:28:07.685753403 -0400
@@ -99 +99 @@
-            cnf.readfp(f)
+            cnf.read_file(f)
@@ -143 +143 @@
-            cnf.readfp(f)
+            cnf.read_file(f)
@@ -184 +184 @@
-            conf.readfp(f)
+            conf.read_file(f)
@@ -189 +189 @@
-                conf.readfp(f)
+                conf.read_file(f)

With that change, and while tailing *.log under /var/log/glusterfs, I issued the create command and configured the port permanently:

gluster volume geo-replication j geoacct at pms::n create ssh-port 6427 push-pem
gluster volume geo-replication j geoacct at pms::n config ssh-port 6427

These were successful, and a status query then shows Created.

Thereafter, I issued the start command, at which point ... nothing. I can run status queries forever, and I can re-run start, which continues to exit with SUCCESS, but georep remains in Created state, never moving to Active. I tried "start force", but that didn't help either.

I've looked for status files under /var/lib/glusterfs/geo-replication; the file monitor.status says "Created". Unsurprisingly, the "status detail" command shows several additional "N/A" entries. /var/lib/glusterd/geo-replication/j_pms_n/gsyncd.conf contains only a [vars] section with the configured ssh port. In the status output, "secondary node" shows N/A. Should it?

What is left, that feeds the battle but starves the victory?

--karl

------------------------------------------------

*gluster volume geo-replication j geoacct at pms::n start*

[2024-08-29 22:26:22.712156 +0000] I [cli.c:788:main] 0-cli: Started running gluster with version 11.1
[2024-08-29 22:26:22.771551 +0000] I [MSGID: 101188] [event-epoll.c:643:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2024-08-29 22:26:22.771579 +0000] I [MSGID: 101188] [event-epoll.c:643:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]

==> ./glusterd.log <==
[2024-08-29 22:26:22.825048 +0000] I [MSGID: 106327] [glusterd-geo-rep.c:2644:glusterd_get_statefile_name] 0-management: Using passed config template(/var/lib/glusterd/geo-replication/j_pms_n/gsyncd.conf).

==> ./cmd_history.log <==
[2024-08-29 22:26:23.464111 +0000]  : volume geo-replication j geoacct at pms::n start : SUCCESS

*Starting geo-replication session between j & geoacct at pms::n has been successful*

==> ./cli.log <==
[2024-08-29 22:26:23.464347 +0000] I [input.c:31:cli_batch] 0-: Exiting with: 0
[2024-08-29 22:26:23.467828 +0000] I [cli.c:788:main] 0-cli: Started running /usr/sbin/gluster with version 11.1
[2024-08-29 22:26:23.467861 +0000] I [cli.c:664:cli_rpc_init] 0-cli: Connecting to remote glusterd at localhost
[2024-08-29 22:26:23.522725 +0000] I [MSGID: 101188] [event-epoll.c:643:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2024-08-29 22:26:23.522767 +0000] I [MSGID: 101188] [event-epoll.c:643:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2024-08-29 22:26:23.523087 +0000] I [cli-rpc-ops.c:808:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0
[2024-08-29 22:26:23.523285 +0000] I [input.c:31:cli_batch] 0-: Exiting with: 0

*gluster volume geo-replication j geoacct at pms::n status*

[2024-08-29 22:26:30.861404 +0000] I [cli.c:788:main] 0-cli: Started running gluster with version 11.1
[2024-08-29 22:26:30.914925 +0000] I [MSGID: 101188] [event-epoll.c:643:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2024-08-29 22:26:30.915017 +0000] I [MSGID: 101188] [event-epoll.c:643:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]

==> ./cmd_history.log <==
[2024-08-29 22:26:31.407365 +0000]  : volume geo-replication j geoacct at pms::n status : SUCCESS

PRIMARY NODE    PRIMARY VOL    PRIMARY BRICK    SECONDARY USER    SECONDARY         SECONDARY NODE    STATUS     CRAWL STATUS    LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------
major           j              /xx/brick/j      geoacct           geoacct at pms::n    N/A               Created    N/A             N/A

==> ./cli.log <==
[2024-08-29 22:26:31.408209 +0000] I [input.c:31:cli_batch] 0-: Exiting with: 0