Tony Maro
2013-Jul-26 14:38 UTC
[Gluster-users] Geo-replication fails to even TRY to start doesnt create /tmp/gsyncd-* dir
Setting up geo-replication with an existing 3 TB of data is turning out to be a huge pain.

It was working for a bit, but would go faulty by the time it hit 1 TB synced. Multiple attempts resulted in the same thing.

Now, I don't know what's changed, but it never actually tries to log into the remote server anymore. Checking the "last" logs on the destination shows that it never even attempts to make the SSH connection. The geo-replication command is:

    gluster volume geo-replication docstore1 root@backup-ds2.gluster:/data/docstore1 start

From the log:

    [2013-07-26 10:26:04.317667] I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:docstore1 -> ssh://root@backup-ds2.gluster:/data/docstore1
    [2013-07-26 10:26:08.258853] I [syncdutils(monitor):142:finalize] <top>: exiting.
    [2013-07-26 10:26:08.259452] E [syncdutils:173:log_raise_exception] <top>: connection to peer is broken
    [2013-07-26 10:26:08.260386] E [resource:191:errlog] Popen: command "ssh -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-WlTfNb/gsycnd-ssh-%r@%h:%p root@backup-ds2.gluster /usr/lib/glusterfs/glusterfs/gsyncd --session-owner 24f8c92d-723e-4513-9593-40ef4b7e766a -N --listen --timeout 120 file:///data/docstore1" returned with 143

When I run the SSH command from the log directly in the console, ssh replies with:

    muxserver_listen bind(): No such file or directory

And there's no gsyncd temp directory at the path specified. If I manually create that directory and re-run the same command, it works. The problem, of course, is that the temp directory is randomly named, and starting Gluster geo-rep again will result in a new directory that it tries to use.

Running Gluster 3.3.1-ubuntu1~precise9

Any ideas why this would be happening? I did find that my Ubuntu packages were trying to access gsyncd in the wrong path, so I corrected that. I've also got auto-ssh login using root, so I changed my ssh command (and my global ssh config) to make sure the options would work.

Here are the important geo-rep configs:

    ssh_command: ssh
    remote_gsyncd: /usr/lib/glusterfs/glusterfs/gsyncd
    gluster_command_dir: /usr/sbin/
    gluster_params: xlator-option=*-dht.assert-no-child-down=true

Thanks,
Tony
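The "muxserver_listen bind(): No such file or directory" error is what ssh prints when the directory that should hold the control socket (the path given to -S) does not exist; ssh creates the socket but not its parent directory. A minimal Python sketch of reproducing the logged command by hand follows. It assumes the random directory name is produced by something like tempfile.mkdtemp with a "gsyncd-aux-ssh-" prefix, which is a guess from the log and not confirmed here. The helper names make_control_dir and build_ssh_argv are hypothetical and are not part of gsyncd.

```python
import os
import tempfile


def make_control_dir(prefix="gsyncd-aux-ssh-"):
    """Create a private temp directory to hold the ssh control socket.

    The prefix only mirrors the directory name seen in the log; how
    gsyncd itself creates the directory is an assumption.
    """
    return tempfile.mkdtemp(prefix=prefix)


def build_ssh_argv(control_dir, remote, remote_command):
    """Build an ssh argument list like the one in the log.

    Raises FileNotFoundError early if the control directory is missing,
    because ssh would otherwise fail when binding the control socket.
    """
    if not os.path.isdir(control_dir):
        raise FileNotFoundError(
            "control socket directory does not exist: " + control_dir
        )
    # Socket file name copied verbatim from the log; %r, %h and %p are
    # expanded by ssh to remote user, host and port.
    socket_path = os.path.join(control_dir, "gsycnd-ssh-%r@%h:%p")
    return ["ssh", "-oControlMaster=auto", "-S", socket_path, remote] + list(
        remote_command
    )
```

To try it with the values from the log, call build_ssh_argv(make_control_dir(), "root@backup-ds2.gluster", [...]) with the remote gsyncd command line as the last argument, then pass the resulting list to subprocess. Because the directory is created first, the bind error should not appear; anything that still fails is then a separate problem.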
Tony Maro
2013-Jul-26 14:42 UTC
[Gluster-users] Geo-replication fails to even TRY to start doesnt create /tmp/gsyncd-* dir
Correction: manually running the command after creating the temp directory doesn't actually work. It doesn't error out; it just hangs and never connects to the remote server. I don't know if this is something within gsyncd or what...

--
Thanks,
Tony Maro
Chief Information Officer
EvriChart, www.evrichart.com
Advanced Records Management
Office: 888.801.2020, 304.536.1290
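When testing the logged command by hand, it can help to tell a hang apart from a failure, and to read the exit status. By shell convention a status above 128 means "killed by signal (status - 128)", so the 143 in the log would correspond to signal 15, SIGTERM. Whether gsyncd reports statuses that way is an assumption here, not something the log confirms. The sketch below uses only the Python standard library; describe_exit and run_with_timeout are hypothetical helper names.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Describe an exit status, decoding signal deaths where possible."""
    if returncode < 0:
        # Python's subprocess reports death by signal N as -N.
        return "killed by " + signal.Signals(-returncode).name
    if returncode > 128:
        # Shells report death by signal N as 128 + N.
        try:
            return "killed by " + signal.Signals(returncode - 128).name
        except ValueError:
            pass  # not a valid signal number, treat as a plain status
    return "exited with status %d" % returncode


def run_with_timeout(argv, seconds):
    """Run argv and report whether it hung or how it exited."""
    try:
        proc = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this.
        return "still running after %s s (possible hang)" % seconds
    return describe_exit(proc.returncode)
```

For example, describe_exit(143) gives "killed by SIGTERM". Passing the ssh argument list from the log to run_with_timeout with a limit of, say, 30 seconds would show whether the manual run really hangs or eventually exits with an error.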