One silly question: Did you try adding some files on the source volume after the georep was created ?

Can you share the output of your version of 'sh -x /usr/libexec/glusterfs/gverify.sh masterVol slaveUser slaveHost slaveVol sshPort logFileName' check ?

Best Regards,
Strahil Nikolov

On Fri, Aug 30, 2024 at 23:29, Karl Kleinpaste <karl at kleinpaste.org> wrote:

On 8/30/24 04:17, Strahil Nikolov wrote:
> Have you done the following setup on the receiving gluster volume:

Yes. For completeness' sake:

grep geoacct /etc/passwd /etc/group
/etc/passwd:geoacct:x:5273:5273:gluster geo-replication:/var/lib/glusterd/geoacct:/bin/bash
/etc/group:geoacct:x:5273:

gluster-mountbroker status
+-----------+-------------+---------------------------+-------------+----------------+
|    NODE   | NODE STATUS |         MOUNT ROOT        |    GROUP    |     USERS      |
+-----------+-------------+---------------------------+-------------+----------------+
|    pms    |          UP | /var/mountbroker-root(OK) | geoacct(OK) | geoacct(j, n)  |
| localhost |          UP | /var/mountbroker-root(OK) | geoacct(OK) | geoacct(n, j)  |
+-----------+-------------+---------------------------+-------------+----------------+

restarted glusterd

ssh-keyed, no-password access established.

gluster system:: execute gsec_create reports success, and /var/lib/glusterd/geo-replication/common_secret.pem.pub exists on both systems, containing 4 lines.

created geo-rep session, and configured ssh-port, as before, successful.

Issued start command. Status report still says Created.

PRIMARY NODE    PRIMARY VOL    PRIMARY BRICK    SECONDARY USER    SECONDARY         SECONDARY NODE    STATUS     CRAWL STATUS    LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------
pjs             j              /xx/brick/j      geoacct           geoacct@pms::n    N/A               Created    N/A             N/A

> Also, I think the source node must be able to reach the gluster slave. Try to mount the slave vol
> on the master via the fuse client in order to verify the status.

Each has mounted the other's volume. A single file, /gluster/j/stuff, is seen by both and is not replicated into /gluster/n.

> Also, check with the following command (found it in https://access.redhat.com/solutions/2616791 ):
> sh -x /usr/libexec/glusterfs/gverify.sh masterVol slaveUser slaveHost slaveVol sshPort logFileName

That must be the wrong URL; "libexec" doesn't appear there. However, running it with locally-appropriate args:

/usr/libexec/glusterfs/gverify.sh j geoacct pms n 6427 /tmp/verify.log

...generates a great deal of regular logging output, logs nothing in /tmp/verify.log, but the shell execution trace made this complaint:

shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory

So I went looking for directories that might be restricted (it didn't tell me which one it didn't like), thus:

ls -ld / /gluster /gluster/? /xx /xx/brick /xx/brick/j
find /var/lib/glu* -type d | xargs ls -ld

The only directory, on both systems, that was at all restricted was /var/lib/glusterd/geo-replication, so...

chmod ugo+rx /var/lib/glusterd/geo-replication

...to take care of that. Again, attempted start, to no effect; the session remains in Created state.

I wish there were a single, exhaustive description of "problems causing georep not to initiate."
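For the record, here is my reading of the full non-root sequence, condensed (volume and account names as above; this is a sketch pieced together from the docs and my notes, not a verified transcript, and the exact option spellings, especially around ssh-port and the pem-key helper, should be checked against your gluster version):

# secondary (receiving) side, as root
gluster-mountbroker setup /var/mountbroker-root geoacct
gluster-mountbroker add n geoacct
systemctl restart glusterd

# primary side, as root; passwordless ssh to geoacct@pms already in place
gluster system:: execute gsec_create
gluster volume geo-replication j geoacct@pms::n create ssh-port 7887 push-pem
gluster volume geo-replication j geoacct@pms::n config ssh-port 7887

# the non-root docs also call for this on the secondary, as root, after create push-pem
/usr/libexec/glusterfs/set_geo_rep_pem_keys.sh geoacct j n

# back on the primary
gluster volume geo-replication j geoacct@pms::n start
gluster volume geo-replication j geoacct@pms::n status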
The fact that it reports apparent success without moving forward to Active state is odd, and maddening. What event or state is each part of the process waiting for? Having started (in principle), what is it that evaluates the conditions for moving to Active?
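For anyone willing to dig with me: the only places I know to look for whatever makes that decision are the per-session gsyncd logs on the primary, the detailed status, and glusterd's own log, roughly as below. The log directory name is my guess at the <primaryvol>_<secondaryhost>_<secondaryvol> naming convention and may differ on your build.

gluster volume geo-replication j geoacct@pms::n status detail
less /var/log/glusterfs/geo-replication/j_pms_n/gsyncd.log   # per-session monitor/worker log
less /var/log/glusterfs/glusterd.log                         # glusterd's handling of the start command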
FYI, I will be traveling for the next week, and may not see email much until then. Your questions...

On 8/31/24 04:59, Strahil Nikolov wrote:
> One silly question: Did you try adding some files on the source volume
> after the georep was created ?

Yes. I wondered that, too, whether geo-rep would not start simply because there was nothing to do. But yes, there are a few files created after the geo-rep session was created. And status remains just "Created."

> Can you share the output of your version of 'sh -x
> /usr/libexec/glusterfs/gverify.sh masterVol slaveUser slaveHost
> slaveVol sshPort logFileName' check ?

sh -x /usr/libexec/glusterfs/gverify.sh j geoacct pms n 7887 /tmp/logger
+ BUFFER_SIZE=104857600
+ SSH_PORT=7887
++ gluster --print-logdir
+ primary_log_file=/var/log/glusterfs/geo-replication/gverify-primarymnt.log
++ gluster --print-logdir
+ secondary_log_file=/var/log/glusterfs/geo-replication/gverify-secondarymnt.log
+ main j geoacct pms n 7887 /tmp/logger
+ log_file=/tmp/logger
+ inet6=
+ local cmd_line
+ local ver
+ ping_host pms 7887
+ '[' 0 -ne 0 ']'
+ [[ -z '' ]]
+ ssh -p 7887 -oNumberOfPasswordPrompts=0 -oStrictHostKeyChecking=no geoacct@pms 'echo Testing_Passwordless_SSH'
Testing_Passwordless_SSH
+ '[' 0 -ne 0 ']'
++ cmd_secondary
++ local cmd_line
+++ cat
++ cmd_line='function do_verify() { ver=$(gluster --version | head -1 | cut -f2 -d " "); echo $ver; }; source /etc/profile && do_verify;'
++ echo function 'do_verify()' '{' 'ver=$(gluster' --version '|' head -1 '|' cut -f2 -d '"' '");' echo '$ver;' '};' source /etc/profile '&&' 'do_verify;'
+ cmd_line='function do_verify() { ver=$(gluster --version | head -1 | cut -f2 -d " "); echo $ver; }; source /etc/profile && do_verify;'
+ [[ -z '' ]]
++ ssh -p 7887 -oNumberOfPasswordPrompts=0 -oStrictHostKeyChecking=no geoacct@pms bash -c ''\''function do_verify() { ver=$(gluster --version | head -1 | cut -f2 -d " "); echo $ver; }; source /etc/profile && do_verify;'\'''
+ ver=11.1
+ '[' -z 11.1 ']'
+ ERRORS=0
++ primary_stats j
++ PRIMARYVOL=j
++ local inet6=
++ local d
++ local i
++ local disk_size
++ local used_size
++ local ver
++ local m_status
+++ mktemp -d -t gverify.sh.XXXXXX
++ d=/tmp/gverify.sh.7QpBxa
++ '[' '' = inet6 ']'
++ glusterfs -s localhost '--xlator-option=*dht.lookup-unhashed=off' --volfile-id j -l /var/log/glusterfs/geo-replication/gverify-primarymnt.log /tmp/gverify.sh.7QpBxa
+++ get_inode_num /tmp/gverify.sh.7QpBxa
+++ local os
+++ case `uname -s` in
++++ uname -s
+++ os=Linux
+++ [[ XLinux = \X\N\e\t\B\S\D ]]
++++ stat -c %i /tmp/gverify.sh.7QpBxa
+++ echo 1
++ i=1
++ [[ 1 -ne 1 ]]
++ cd /tmp/gverify.sh.7QpBxa
+++ disk_usage /tmp/gverify.sh.7QpBxa
+++ local os
+++ case `uname -s` in
+++ awk '{print $2}'
++++ uname -s
+++ os=Linux
+++ [[ XLinux = \X\N\e\t\B\S\D ]]
++++ df -P -B1 /tmp/gverify.sh.7QpBxa
++++ tail -1
+++ echo localhost:j 286755311616 2869571584 283885740032 2% /tmp/gverify.sh.7QpBxa
++ disk_size=286755311616
+++ disk_usage /tmp/gverify.sh.7QpBxa
+++ local os
+++ case `uname -s` in
+++ awk '{print $3}'
++++ uname -s
+++ os=Linux
+++ [[ XLinux = \X\N\e\t\B\S\D ]]
++++ df -P -B1 /tmp/gverify.sh.7QpBxa
++++ tail -1
+++ echo localhost:j 286755311616 2869571584 283885740032 2% /tmp/gverify.sh.7QpBxa
++ used_size=2869571584
++ umount_lazy /tmp/gverify.sh.7QpBxa
++ local os
++ case `uname -s` in
+++ uname -s
++ os=Linux
++ [[ XLinux = \X\N\e\t\B\S\D ]]
++ umount -l /tmp/gverify.sh.7QpBxa
++ rmdir /tmp/gverify.sh.7QpBxa
+++ gluster --version
+++ head -1
+++ cut -f2 -d ' '
++ ver=11.1
+++ echo 286755311616:2869571584:11.1
++ m_status=286755311616:2869571584:11.1
++ echo 286755311616:2869571584:11.1
+ primary_data=286755311616:2869571584:11.1
++ secondary_stats geoacct pms n
++ set -x
++ SECONDARYUSER=geoacct
++ SECONDARYHOST=pms
++ SECONDARYVOL=n
++ local inet6=
++ local cmd_line
++ local ver
++ local status
+++ mktemp -d -t gverify.sh.XXXXXX
++ d=/tmp/gverify.sh.WqWlni
++ '[' '' = inet6 ']'
++ glusterfs '--xlator-option=*dht.lookup-unhashed=off' --volfile-server pms --volfile-id n -l /var/log/glusterfs/geo-replication/gverify-secondarymnt.log /tmp/gverify.sh.WqWlni
+++ get_inode_num /tmp/gverify.sh.WqWlni
+++ local os
+++ case `uname -s` in
++++ uname -s
+++ os=Linux
+++ [[ XLinux = \X\N\e\t\B\S\D ]]
++++ stat -c %i /tmp/gverify.sh.WqWlni
+++ echo 1
++ i=1
++ [[ 1 -ne 1 ]]
++ cd /tmp/gverify.sh.WqWlni
+++ disk_usage /tmp/gverify.sh.WqWlni
+++ local os
+++ case `uname -s` in
+++ awk '{print $2}'
++++ uname -s
+++ os=Linux
+++ [[ XLinux = \X\N\e\t\B\S\D ]]
++++ df -P -B1 /tmp/gverify.sh.WqWlni
++++ tail -1
+++ echo pms:n 286755311616 2868690944 283886620672 2% /tmp/gverify.sh.WqWlni
++ disk_size=286755311616
+++ disk_usage /tmp/gverify.sh.WqWlni
+++ local os
+++ awk '{print $3}'
+++ case `uname -s` in
++++ uname -s
+++ os=Linux
+++ [[ XLinux = \X\N\e\t\B\S\D ]]
++++ df -P -B1 /tmp/gverify.sh.WqWlni
++++ tail -1
+++ echo pms:n 286755311616 2868690944 283886620672 2% /tmp/gverify.sh.WqWlni
++ used_size=2868690944
+++ find /tmp/gverify.sh.WqWlni -maxdepth 1 -path /tmp/gverify.sh.WqWlni/.trashcan -prune -o -path /tmp/gverify.sh.WqWlni -o -print0 -quit
++ no_of_files=
++ umount_lazy /tmp/gverify.sh.WqWlni
++ local os
++ case `uname -s` in
+++ uname -s
++ os=Linux
++ [[ XLinux = \X\N\e\t\B\S\D ]]
++ umount -l /tmp/gverify.sh.WqWlni
++ rmdir /tmp/gverify.sh.WqWlni
+++ cmd_secondary
+++ local cmd_line
++++ cat
+++ cmd_line='function do_verify() { ver=$(gluster --version | head -1 | cut -f2 -d " "); echo $ver; }; source /etc/profile && do_verify;'
+++ echo function 'do_verify()' '{' 'ver=$(gluster' --version '|' head -1 '|' cut -f2 -d '"' '");' echo '$ver;' '};' source /etc/profile '&&' 'do_verify;'
++ cmd_line='function do_verify() { ver=$(gluster --version | head -1 | cut -f2 -d " "); echo $ver; }; source /etc/profile && do_verify;'
+++ SSHM geoacct@pms bash -c ''\''function do_verify() { ver=$(gluster --version | head -1 | cut -f2 -d " "); echo $ver; }; source /etc/profile && do_verify;'\'''
+++ [[ -z '' ]]
+++ ssh -p 7887 -q -oPasswordAuthentication=no -oStrictHostKeyChecking=no -oControlMaster=yes geoacct@pms bash -c ''\''function do_verify() { ver=$(gluster --version | head -1 | cut -f2 -d " "); echo $ver; }; source /etc/profile && do_verify;'\'''
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
++ ver=11.1
++ status=286755311616:2868690944:11.1:
++ echo 286755311616:2868690944:11.1:
++ set +x
+ secondary_data=286755311616:2868690944:11.1:
++ echo 286755311616:2869571584:11.1
++ cut -f1 -d:
+ primary_disk_size=286755311616
++ echo 286755311616:2868690944:11.1:
++ cut -f1 -d:
+ secondary_disk_size=286755311616
++ echo 286755311616:2869571584:11.1
++ cut -f2 -d:
+ primary_used_size=2869571584
++ echo 286755311616:2868690944:11.1:
++ cut -f2 -d:
+ secondary_used_size=2868690944
++ echo 286755311616:2869571584:11.1
++ cut -f3 -d:
+ primary_version=11.1
++ echo 286755311616:2868690944:11.1:
++ cut -f3 -d:
+ secondary_version=11.1
++ echo 286755311616:2868690944:11.1:
++ cut -f4 -d:
+ secondary_no_of_files=
+ [[ x286755311616 = \x ]]
+ [[ x11.1 = \x ]]
+ [[ 286755311616 -eq 0 ]]
+ [[ x286755311616 = \x ]]
+ [[ x11.1 = \x ]]
+ [[ 286755311616 -eq 0 ]]
+ '[' 286755311616 -lt 286755311616 ']'
+ effective_primary_used_size=2974429184
+ secondary_available_size=283886620672
+ primary_available_size=283780882432
+ '[' 283886620672 -lt 283780882432 ']'
+ '[' '!' -z ']'
+ [[ 11.1 != 11.1 ]]
+ exit 0
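One more note on that trace: the shell-init/getcwd complaint comes from the remote bash that ssh spawns as geoacct on pms, and as I understand it, it means the directory that shell starts in (normally geoacct's home) is missing or not traversable by geoacct. A quick probe, using my port and the home path from the passwd entry earlier in the thread:

# if pwd errors out, or the home directory is absent or unreachable,
# that would explain the getcwd message from the remote shell
ssh -p 7887 geoacct@pms 'echo HOME=$HOME; pwd; ls -ld /var/lib/glusterd /var/lib/glusterd/geoacct'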