thr3ads.net - Gluster users - [Gluster-users] Can't stop (or control) geo-replication? [Apr 2014]

Danny Sauer
2014-Apr-24 22:02 UTC
[Gluster-users] Can't stop (or control) geo-replication?

No, I still haven't heard anything from the community, and I just removed
the ssh keys for the broken systems so they don't try to start up the
"bad" replication configs (which is incredibly ugly). Someday soon
I'm planning to build a test cluster to experiment on, though, and will
follow up if I figure out a solution.

--Danny

Steve Dainard <sdainard at miovision.com> wrote:
>Hi Danny,
>
>
>Did you get anywhere with this geo-rep issue? I have a similar problem
>running on CentOS 6.5 when trying anything other than 'start' with
>geo-rep.
>
>
>Thanks,
>
>
>Steve
>
>
>On Tue, Feb 25, 2014 at 9:45 AM, Danny Sauer <danny at dannysauer.com> wrote:
>
>
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>I have the current gluster 3.4 running on some RHEL6 systems. For some
>reason, all of the geo-replication commands which change a config file (start,
>stop, config) return failure. Despite this, "start" actually starts
>it up. I'd be mostly ok with this if stop also actually stopped it; but
>that does not happen. The "command failed" behavior is consistent
>across all nodes. The binaries are the result of downloading the source RPM and
>"rpm --rebuild"ing, since the packages on the download server still
>don't install on anything but the latest RHEL6 (that ssl library dependency
>thing); I didn't change anything, just directly rebuilt from the source
>package. I have working ssh between the systems, and files do propagate over; I
>can see in the logs that ssh does connect and start up the gsyncd. I just have
>several test configs that I'd like to not have running now, but they
>won't stay dead. :)
>
>Is there a way to forcibly remove several geo-replication configs outside of
>the shell tool? I tried editing the config file to change the ssh command path
>for one of them, and my changes kept getting overwritten by metadata from the
>other nodes (yes, time is in sync on all nodes using ntp against the same
>server), so I'm assuming that deleting the relevant block from the config
>file won't do it?
>
>The really weird thing is that other volume management tasks work fine; I
>can add/remove bricks from volumes, create, start and stop regular volumes,
>etc. It's just the geo-replication management part that fails.
>
>Thanks for any input you can provide. :) Some example output (with
>username, IP, and hostnames changed to protect the innocent) is below.
>
>- --Danny
>
>
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 stop
>
>geo-replication command failed
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 config
>gluster_log_file: /var/log/glusterfs/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.gluster.log
>ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
>session_owner: ace6b109-ba88-4c2e-9381-f2fc31aa36b5
>remote_gsyncd: /usr/libexec/glusterfs/gsyncd
>socketdir: /var/run
>state_file: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.status
>state_socket_unencoded: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.socket
>gluster_command_dir: /usr/sbin/
>pid_file: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.pid
>log_file: /var/log/glusterfs/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.log
>gluster_params: xlator-option=*-dht.assert-no-child-down=true
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 status
>NODE                 MASTER               SLAVE                                              STATUS
>---------------------------------------------------------------------------------------------------
>gluster1             sec                  ssh://slave_73::geo_sec_73                         faulty
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 stop
>
>geo-replication command failed
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 status
>NODE                 MASTER               SLAVE                                              STATUS
>---------------------------------------------------------------------------------------------------
>gluster1             sec                  ssh://slave_73::geo_sec_73                         faulty
>
>
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG v1.4.14 (GNU/Linux)
>Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
>iEYEARECAAYFAlMMrHEACgkQvtwZjjd2PN8kpQCfVjtKeO7DCvhT9SpK+LEulZVZ
>c0wAn16xAT14V+oNOilbKwHDoM68EIbW
>=QfSZ
>-----END PGP SIGNATURE-----
>
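A note on the long file names in the quoted config output: the stem shared by the log, state, socket, and pid paths appears to be the percent-encoded slave URL. A minimal sketch, assuming Python 3 and using only the standard library (the helper name decode_session_name is made up for illustration), decodes the stem copied from the state_file line so it is easier to tell which session a given file belongs to:

```python
from urllib.parse import unquote

# File name stem copied verbatim from the state_file path in the quoted output.
ENCODED_STEM = (
    "ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73"
)


def decode_session_name(stem: str) -> str:
    """Percent-decode a geo-replication file name stem (hypothetical helper)."""
    return unquote(stem)


decoded = decode_session_name(ENCODED_STEM)
print(decoded)  # ssh://root@1.2.3.4:gluster://127.0.0.1:geo_sec_73
```

This only decodes the name; it says nothing about how glusterd itself maps sessions to files.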
>
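Since "stop" reports failure either way, the only reliable signal in the transcript is the status table. The following is a rough sketch for checking it from a script, assuming Python 3 and assuming the four-column layout (NODE, MASTER, SLAVE, STATUS) with single-word values seen in the quoted output; other gluster versions may print something different. The function name parse_status and the SAMPLE text are illustrative, not part of any gluster tooling:

```python
# Sample text modeled on the status output quoted in the original message.
SAMPLE = """\
NODE                 MASTER               SLAVE                                              STATUS
---------------------------------------------------------------------------------------------------
gluster1             sec                  ssh://slave_73::geo_sec_73                         faulty
"""


def parse_status(text):
    """Return one dict per session row (assumes four whitespace-separated columns)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator line.
        if len(fields) != 4 or fields[0] == "NODE":
            continue
        rows.append(dict(zip(("node", "master", "slave", "status"), fields)))
    return rows


sessions = parse_status(SAMPLE)
faulty = [s["slave"] for s in sessions if s["status"] == "faulty"]
print(faulty)
```

In practice the text would come from capturing the output of the same "status" command shown in the transcript, rather than from a hard-coded sample.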