Danny Sauer
2014-Feb-25 14:45 UTC
[Gluster-users] Can't stop (or control) geo-replication?
I have the current Gluster 3.4 running on some RHEL6 systems. For some reason, all of the geo-replication commands which change a config file (start, stop, config) return failure. Despite this, "start" actually starts it up. I'd be mostly OK with that if "stop" also actually stopped it, but it does not. The "command failed" behavior is consistent across all nodes. The binaries are the result of downloading the source RPM and "rpm --rebuild"ing it, since the packages on the download server still don't install on anything but the latest RHEL6 (that SSL library dependency thing); I didn't change anything, just rebuilt directly from the source package. I have working ssh between the systems, and files do propagate over; I can see in the logs that ssh connects and starts up gsyncd. I just have several test configs that I'd like to not have running now, but they won't stay dead. :)

Is there a way to forcibly remove several geo-replication configs outside of the shell tool? I tried editing the config file to change the ssh command path for one of them, and my changes kept getting overwritten by metadata from the other nodes (yes, time is in sync on all nodes using NTP against the same server), so I'm assuming that deleting the relevant block from the config file won't do it?

The really weird thing is that other volume management tasks work fine; I can add/remove bricks from volumes, and create, start, and stop regular volumes, etc. It's just the geo-replication management part that fails.

Thanks for any input you can provide. :) Some example output (with username, IP, and hostnames changed to protect the innocent) is below.
--Danny

user at gluster1 [/home/user]
$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 stop
geo-replication command failed

user at gluster1 [/home/user]
$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 config
gluster_log_file: /var/log/glusterfs/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
session_owner: ace6b109-ba88-4c2e-9381-f2fc31aa36b5
remote_gsyncd: /usr/libexec/glusterfs/gsyncd
socketdir: /var/run
state_file: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.status
state_socket_unencoded: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.socket
gluster_command_dir: /usr/sbin/
pid_file: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.pid
log_file: /var/log/glusterfs/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.log
gluster_params: xlator-option=*-dht.assert-no-child-down=true

user at gluster1 [/home/user]
$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 status
NODE        MASTER    SLAVE                         STATUS
---------------------------------------------------------------------------------------------------
gluster1    sec       ssh://slave_73::geo_sec_73    faulty

user at gluster1 [/home/user]
$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 stop
geo-replication command failed

user at gluster1 [/home/user]
$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 status
NODE        MASTER    SLAVE                         STATUS
---------------------------------------------------------------------------------------------------
gluster1    sec       ssh://slave_73::geo_sec_73    faulty
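[Editor's note: the question above asks how to remove a session outside the shell tool. A minimal sketch of removing a session's on-disk state by hand is below. This is a hedged workaround under stated assumptions, not an official procedure: the geo-replication/<volume> layout is taken from the config output above, and GLUSTERD_DIR and the mock-tree setup are parameterised here only so the sketch can be exercised safely; on RHEL6 the real directory is /var/lib/glusterd. On a real cluster this would need to run on every node with glusterd stopped first, otherwise the surviving nodes sync the session metadata back, which is exactly the overwrite behavior described above.]

```shell
#!/bin/sh
# Hedged sketch: forcibly drop a geo-replication session's on-disk state
# when "stop" keeps failing. Assumptions: the layout under
# geo-replication/<volume>/ matches the config output in this thread.
# GLUSTERD_DIR defaults to a throwaway mock tree so this is safe to run
# anywhere; point it at /var/lib/glusterd (carefully!) on a real node.
GLUSTERD_DIR="${GLUSTERD_DIR:-$(mktemp -d)/glusterd}"
VOL="${VOL:-sec}"   # master volume name of the session

# Mock the per-volume session directory (status/pid/socket/config files):
mkdir -p "${GLUSTERD_DIR}/geo-replication/${VOL}"

# 1. (real system) stop glusterd so it cannot rewrite gsyncd.conf:
#      service glusterd stop
# 2. (real system) kill any gsyncd workers still syncing this volume:
#      pkill -f "gsyncd.*${VOL}"
# 3. remove the per-volume session state:
rm -rf "${GLUSTERD_DIR}/geo-replication/${VOL}"
# 4. (real system) bring glusterd back up:
#      service glusterd start
echo "removed ${GLUSTERD_DIR}/geo-replication/${VOL}"
```

Repeat on each peer before restarting glusterd anywhere, since any node that still holds the session metadata will propagate it back.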
Steve Dainard
2014-Apr-24 15:04 UTC
[Gluster-users] Can't stop (or control) geo-replication?
Hi Danny,

Did you get anywhere with this geo-rep issue? I have a similar problem running on CentOS 6.5 when trying anything other than 'start' with geo-rep.

Thanks,
Steve

On Tue, Feb 25, 2014 at 9:45 AM, Danny Sauer <danny at dannysauer.com> wrote:
> [original message quoted in full; snipped]
Venky Shankar
2014-Apr-25 05:54 UTC
[Gluster-users] Can't stop (or control) geo-replication?
Anything in the glusterd log file? Running glusterd in debug mode (glusterd -LDEBUG) would provide more logs for debugging.

On Tue, Feb 25, 2014 at 8:15 PM, Danny Sauer <danny at dannysauer.com> wrote:
> [original message quoted in full; snipped]
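[Editor's note: a minimal sketch of the debugging flow Venky suggests is below: restart glusterd at debug level, reproduce the failing "stop", then pull the geo-replication lines out of the glusterd log. The log path is the usual RHEL6 default and is an assumption here; the /dev/null fallback only exists so the sketch runs even on a machine without that log.]

```shell
#!/bin/sh
# Hedged sketch of the debug workflow: the grep is the only step executed
# here; the glusterd restart and the reproducing command are shown as
# comments because they need a real cluster.
LOGFILE="${LOGFILE:-/var/log/glusterfs/etc-glusterfs-glusterd.vol.log}"
[ -r "$LOGFILE" ] || LOGFILE=/dev/null  # fall back so the sketch runs anywhere

# 1. (real system) restart glusterd with debug logging:
#      service glusterd stop && glusterd -LDEBUG
# 2. (real system) reproduce the failure:
#      gluster volume geo-replication sec ssh://slave_73::geo_sec_73 stop
# 3. show the most recent geo-replication chatter around the failure:
grep -i "geo-rep" "$LOGFILE" | tail -n 20
matches=$(grep -ci "geo-rep" "$LOGFILE" || true)
echo "geo-replication log lines found: ${matches}"
```

The interesting part is usually the last DEBUG/ERROR lines glusterd emits right after the failing "stop" is issued, which should say why the staging or commit of the command was rejected.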