Don Ky
2015-Aug-10 21:38 UTC
[Gluster-users] Cannot get geo-replicate working with Gluster 3.7
Hello all, I've been struggling to get gluster geo-replication functionality working for the last couple of days. I keep getting the following errors:

[2015-08-10 17:27:07.855817] E [resource(/gluster/volume1):222:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-Cnh7xL/ee1e6b6c8823302e93454e632bd81fbe.sock root at gluster02.example.com /nonexistent/gsyncd --session-owner 50600483-7aa3-4fab-a66c-63350af607b0 -N --listen --timeout 120 gluster://localhost:volume1-replicate" returned with 127, saying:
[2015-08-10 17:27:07.856066] E [resource(/gluster/volume1):226:logerr] Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory
[2015-08-10 17:27:07.856441] I [syncdutils(/gluster/volume1):220:finalize] <top>: exiting.
[2015-08-10 17:27:07.858120] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-08-10 17:27:07.858361] I [syncdutils(agent):220:finalize] <top>: exiting.
[2015-08-10 17:27:07.858211] I [monitor(monitor):274:monitor] Monitor: worker(/gluster/volume1) died before establishing connection
[2015-08-10 17:27:18.181344] I [monitor(monitor):221:monitor] Monitor: ------------------------------------------------------------
[2015-08-10 17:27:18.181842] I [monitor(monitor):222:monitor] Monitor: starting gsyncd worker
[2015-08-10 17:27:18.387790] I [gsyncd(/gluster/volume1):649:main_i] <top>: syncing: gluster://localhost:volume1 -> ssh://root at gluster02.example.com:gluster://localhost:volume1-replicate
[2015-08-10 17:27:18.389427] D [gsyncd(agent):643:main_i] <top>: rpc_fd: '7,11,10,9'
[2015-08-10 17:27:18.390553] I [changelogagent(agent):75:__init__] ChangelogAgent: Agent listining...
[2015-08-10 17:27:18.418788] D [repce(/gluster/volume1):191:push] RepceClient: call 8460:140341431777088:1439242038.42 __repce_version__() ...
[2015-08-10 17:27:18.629983] E [syncdutils(/gluster/volume1):252:log_raise_exception] <top>: connection to peer is broken
[2015-08-10 17:27:18.630651] W [syncdutils(/gluster/volume1):256:log_raise_exception] <top>: !!!!!!!!!!!!!
[2015-08-10 17:27:18.630794] W [syncdutils(/gluster/volume1):257:log_raise_exception] <top>: !!! getting "No such file or directory" errors is most likely due to MISCONFIGURATION, please consult https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html
[2015-08-10 17:27:18.630929] W [syncdutils(/gluster/volume1):265:log_raise_exception] <top>: !!!!!!!!!!!!!
[2015-08-10 17:27:18.631129] E [resource(/gluster/volume1):222:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-RPuEyN/ee1e6b6c8823302e93454e632bd81fbe.sock root at gluster02.example.com /nonexistent/gsyncd --session-owner 50600483-7aa3-4fab-a66c-63350af607b0 -N --listen --timeout 120 gluster://localhost:volume1-replicate" returned with 127, saying:
[2015-08-10 17:27:18.631280] E [resource(/gluster/volume1):226:logerr] Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory
[2015-08-10 17:27:18.631567] I [syncdutils(/gluster/volume1):220:finalize] <top>: exiting.
[2015-08-10 17:27:18.633125] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-08-10 17:27:18.633183] I [monitor(monitor):274:monitor] Monitor: worker(/gluster/volume1) died before establishing connection
[2015-08-10 17:27:18.633392] I [syncdutils(agent):220:finalize] <top>: exiting.
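(Editorial note on the log above, for context: the exit status 127 means "command not found", and /nonexistent/gsyncd is a deliberate placeholder path. When the session is created with push-pem, glusterd installs the shared key on the slave with a "command=" forced command in authorized_keys, so sshd is expected to ignore the requested path and run the real gsyncd instead. If the same key is also present without the forced command, sshd tries to execute the literal /nonexistent/gsyncd and fails with 127. A correctly installed entry looks roughly like the line below; this is illustrative only, the key material is elided and the exact gsyncd path varies by distribution:)

```
command="/usr/libexec/glusterfs/gsyncd" ssh-rsa AAAA...key-material... root@gluster01
```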
and the status is continuously Faulty:

[root at neptune volume1]# gluster volume geo-replication volume01 gluster02::volume01-replicate status

MASTER NODE    MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE                            SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------------
neptune        volume01      /gluster/volume01    root          gluster02::volume01-replicate    N/A           Faulty    N/A             N/A

What I'm trying to accomplish is to mirror a volume from gluster01 (master) to gluster02 (slave). Here is a breakdown of the steps I took:

yum -y install glusterfs-server glusterfs-geo-replication
service glusterd start

#gluster01
gluster volume create volume1 gluster01.example.com:/gluster/volume1
gluster volume start volume1

#gluster02
gluster volume create volume1-replicate gluster02.example.com:/gluster/volume1-replicate
gluster volume start volume1-replicate

#geo replicate
gluster system:: execute gsec_create

#gluster01
gluster volume geo-replication volume1 gluster02::volume1-replicate create push-pem
gluster volume geo-replication volume1 gluster02::volume1-replicate start
gluster volume geo-replication volume1 gluster02::volume1-replicate status

#mounting and testing
mkdir /mnt/gluster
mount -t glusterfs gluster01.example.com:/volume1 /mnt/gluster
mount -t glusterfs gluster02.example.com:/volume1-replicate /mnt/gluster

#troubleshooting
gluster volume geo-replication volume1 gluster02::volume1-replicate config log-level DEBUG
service glusterd restart
gluster volume geo-replication volume1 gluster02::volume1-replicate config

There was one step before running "gluster volume geo-replication volume1 gluster02::volume1-replicate create push-pem": I copied secret.pub to gluster02 (the slave) and added it to .ssh/authorized_keys. I can ssh as root from gluster01 to gluster02 fine.
I'm currently running:

glusterfs-3.7.3-1.el7.x86_64
glusterfs-cli-3.7.3-1.el7.x86_64
glusterfs-libs-3.7.3-1.el7.x86_64
glusterfs-client-xlators-3.7.3-1.el7.x86_64
glusterfs-fuse-3.7.3-1.el7.x86_64
glusterfs-server-3.7.3-1.el7.x86_64
glusterfs-api-3.7.3-1.el7.x86_64
glusterfs-geo-replication-3.7.3-1.el7.x86_64

on both the master and slave servers. Both servers have ntp installed, are in sync, and are patched. I can mount volume1 or volume1-replicate on each host, and I have confirmed that iptables has been flushed.

Not sure exactly what else to check at this point. There appeared to be another user with similar errors, but the mailing list says he resolved it on his own.

Any ideas? I'm completely lost on what the issue could be. Some of the Red Hat docs mentioned it could be fuse, but it looks like fuse is installed as part of gluster.

Thanks
Aravinda
2015-Aug-12 05:41 UTC
[Gluster-users] Cannot get geo-replicate working with Gluster 3.7
This looks like an SSH configuration issue. Please clean up any lines in /root/.ssh/authorized_keys on the slave that were added for the Master nodes but do not start with "command=". Also, please let us know which ssh key was used to set up passwordless SSH from the Master node to the Slave node.

To resolve the issue:

1. On the "neptune" node, run: cat /var/lib/glusterd/geo-replication/common_secret.pem.pub
2. Open the /root/.ssh/authorized_keys file on the "volume01-replicate" node and check whether the keys from the previous step are present.
3. There should not be any other line containing the same key without "command=" at the beginning. For example:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDhR4kp978pjze9y1ozySB6jgz2VjeLKnCWIIsZ7NFCue1S4lCU7TgNg2g8FwfXR7LX4mRuLFtQeOkEN9kLGaiJZiN06oU2Jz3Y2gx6egxR5lGiumMg7QLPH1PQPJIfT8Qaz1znH+NlpM1BuivjfOsbtVWTBQpANq4uA8ooln2rLTKIzRGQrS6adUD6KwbjIpVEahJqkZf8YaiaTDJZdXdGGvT6YtytogPmuKwrJ+XujaRd49dDcjeOrcjkFxsf9/IuqBvbZYwW2hwTcqqtSHZfIwHaf6X9fhDizVX4WxPhToiK9LZaEF57hnPAa7bl2if9KFoOyfwZByTIwQPqjymv root at neptune

   If such a line exists, remove it.
4. From the "neptune" node, try the following command to see whether the issue is resolved:

ssh -i /var/lib/glusterd/geo-replication/secret.pem root at volume01-replicate

If you see gsyncd messages, then everything is normal. If you see ssh errors, the issue is not resolved.

Let us know if you have any questions.

regards
Aravinda

On 08/11/2015 03:08 AM, Don Ky wrote:
> Hello all,
>
> I've been struggling to get gluster geo-replication functionality
> working for the last couple of days.
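(Editorial aside: the check in steps 1-3 above can be scripted. Below is a rough, hypothetical Python sketch, not part of gluster, that flags authorized_keys lines carrying the geo-replication public key without the required "command=" prefix. The key material and the gsyncd path in the sample data are made up for illustration.)

```python
def find_conflicting_lines(authorized_keys_text, pub_key):
    """Return authorized_keys lines that contain pub_key but lack a
    leading 'command=' forced-command prefix."""
    conflicts = []
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blanks and comments
        if pub_key in line and not line.startswith('command='):
            conflicts.append(line)
    return conflicts

# Example with made-up key material: one correct entry installed by
# push-pem (with command=) and one plain copy of the same key, which is
# exactly the misconfiguration described above.
key = "ssh-rsa AAAAB3Nza...example root@neptune"
sample = "\n".join([
    'command="/usr/libexec/glusterfs/gsyncd" ' + key,
    key,  # plain duplicate -- this line should be removed
])
print(find_conflicting_lines(sample, "AAAAB3Nza...example"))
```

Running this against the real file on the slave (with the key from common_secret.pem.pub) would print exactly the lines that step 3 says to remove.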
I keep getting the following errors: > > > 2015-08-10 17:27:07.855817] E [resource(/gluster/volume1):222:errlog] > Popen: command "ssh -oPasswordAuthentication=no > -oStrictHostKeyChecking=no -i > /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S > /tmp/gsyncd-aux-ssh-Cnh7xL/ee1e6b6c8823302e93454e632bd81fbe.sock > root at gluster02.example.com <mailto:root at gluster02.example.com> > /nonexistent/gsyncd --session-owner > 50600483-7aa3-4fab-a66c-63350af607b0 -N --listen --timeout 120 > gluster://localhost:volume1-replicate" returned with 127, saying: > [2015-08-10 17:27:07.856066] E [resource(/gluster/volume1):226:logerr] > Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory > [2015-08-10 17:27:07.856441] I > [syncdutils(/gluster/volume1):220:finalize] <top>: exiting. > [2015-08-10 17:27:07.858120] I [repce(agent):92:service_loop] > RepceServer: terminating on reaching EOF. > [2015-08-10 17:27:07.858361] I [syncdutils(agent):220:finalize] <top>: > exiting. > [2015-08-10 17:27:07.858211] I [monitor(monitor):274:monitor] Monitor: > worker(/gluster/volume1) died before establishing connection > [2015-08-10 17:27:18.181344] I [monitor(monitor):221:monitor] Monitor: > ------------------------------------------------------------ > [2015-08-10 17:27:18.181842] I [monitor(monitor):222:monitor] Monitor: > starting gsyncd worker > [2015-08-10 17:27:18.387790] I [gsyncd(/gluster/volume1):649:main_i] > <top>: syncing: gluster://localhost:volume1 -> > ssh://root at gluster02.example.com:gluster://localhost:volume1-replicate > [2015-08-10 17:27:18.389427] D [gsyncd(agent):643:main_i] <top>: > rpc_fd: '7,11,10,9' > [2015-08-10 17:27:18.390553] I [changelogagent(agent):75:__init__] > ChangelogAgent: Agent listining... > [2015-08-10 17:27:18.418788] D [repce(/gluster/volume1):191:push] > RepceClient: call 8460:140341431777088:1439242038.42 > __repce_version__() ... 
> [2015-08-10 17:27:18.629983] E > [syncdutils(/gluster/volume1):252:log_raise_exception] <top>: > connection to peer is broken > [2015-08-10 17:27:18.630651] W > [syncdutils(/gluster/volume1):256:log_raise_exception] <top>: > !!!!!!!!!!!!! > [2015-08-10 17:27:18.630794] W > [syncdutils(/gluster/volume1):257:log_raise_exception] <top>: !!! > getting "No such file or directory" errors is most likely due to > MISCONFIGURATION, please consult > https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html > [2015-08-10 17:27:18.630929] W > [syncdutils(/gluster/volume1):265:log_raise_exception] <top>: > !!!!!!!!!!!!! > [2015-08-10 17:27:18.631129] E [resource(/gluster/volume1):222:errlog] > Popen: command "ssh -oPasswordAuthentication=no > -oStrictHostKeyChecking=no -i > /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S > /tmp/gsyncd-aux-ssh-RPuEyN/ee1e6b6c8823302e93454e632bd81fbe.sock > root at gluster02.example.com <mailto:root at gluster02.example.com> > /nonexistent/gsyncd --session-owner > 50600483-7aa3-4fab-a66c-63350af607b0 -N --listen --timeout 120 > gluster://localhost:volume1-replicate" returned with 127, saying: > [2015-08-10 17:27:18.631280] E [resource(/gluster/volume1):226:logerr] > Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory > [2015-08-10 17:27:18.631567] I > [syncdutils(/gluster/volume1):220:finalize] <top>: exiting. > [2015-08-10 17:27:18.633125] I [repce(agent):92:service_loop] > RepceServer: terminating on reaching EOF. > [2015-08-10 17:27:18.633183] I [monitor(monitor):274:monitor] Monitor: > worker(/gluster/volume1) died before establishing connection > [2015-08-10 17:27:18.633392] I [syncdutils(agent):220:finalize] <top>: > exiting. 
> > and the status is continuously faulty: > > [root at neptune volume1]# gluster volume geo-replication volume01 > gluster02::volume01-replicate status > > MASTER NODE MASTER VOL MASTER BRICK SLAVE USER > SLAVE SLAVE NODE STATUS CRAWL STATUS > LAST_SYNCED > ------------------------------------------------------------------------------------------------------------------------------------------------------- > neptune volume01 /gluster/volume01 root > gluster02::volume01-replicate N/A Faulty N/A > N/A > > What I'm trying to accomplish is to mirror a volume from gluster01 > (master) to gluster02 (slave). > > Here is a break down of the steps I took > > yum -y install glusterfs-server glusterfs-geo-replication > service glusterd start > > #gluster01 > gluster volume create volume1 gluster01.example.com:/gluster/volume1 > gluster volume start volume1 > > #gluster02 > gluster volume create volume1-replicate > gluster02.example.com:/gluster/volume1-replicate > gluster volume start volume1-replicate > > > #geo replicate > gluster system:: execute gsec_create > > #gluster01 > gluster volume geo-replication volume1 gluster02::volume1-replicate > create push-pem > gluster volume geo-replication volume1 gluster02::volume1-replicate start > gluster volume geo-replication volume1 gluster02::volume1-replicate status > > #mouting and testing > mkdir /mnt/gluster > mount -t glusterfs gluster01.example.com:/volume1 /mnt/gluster > mount -t glusterfs gluster02.example.com:/volume1-replicate /mnt/gluster > > #troubleshooting > gluster volume geo-replication volume1 gluster02::volume1-replicate > config log-level DEBUG > service glusterd restart > > gluster volume geo-replication volume1 gluster02::volume1-replicate config > > There was one step before running > > gluster volume geo-replication volume1 gluster02::volume1-replicate > create push-pem > > I copied the secret.pub to gluster02(the slave) and added it to > .ssh/authorized_keys. 
I can ssh as root from gluster01 to gluster02 fine. > > I'm currently running: > > glusterfs-3.7.3-1.el7.x86_64 > glusterfs-cli-3.7.3-1.el7.x86_64 > glusterfs-libs-3.7.3-1.el7.x86_64 > glusterfs-client-xlators-3.7.3-1.el7.x86_64 > glusterfs-fuse-3.7.3-1.el7.x86_64 > glusterfs-server-3.7.3-1.el7.x86_64 > glusterfs-api-3.7.3-1.el7.x86_64 > glusterfs-geo-replication-3.7.3-1.el7.x86_64 > > on both slave and master servers. Both servers have ntp installed are > in sync and patched. > > I can mount volume1 or volume1-replicate on each host and confirmed > that iptables have been flushed. > > Not sure exactly what else to check at this point. There appeared to > be another user with similar errors but the mailing list says he > resolved it on his own. > > Any ideas? I'm completely lost on what could be issue. Some of the > redhat docs mentioned it could be fuse but it looks like fuse is > installed as part of gluster. > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150812/5f28711e/attachment.html>