Doug Wilson
2014-Jun-19 21:53 UTC
[Gluster-users] 3.5.0 distributed geo-replication & replica failure
Halloo,

My understanding is that if the active 3.5.0 geo-replication server goes off-line, then after approximately 60 seconds the passive node should recognize that, switch to active, and pick up replicating data where the former active geo-rep server left off.

That's not working for me. The passive node never starts geo-replicating. Running "gluster volume geo-replication <master_volume> <slave_node>::<slave_volume> status detail" from the passive node shows it just sitting there with status "Not Started" forever. As expected, the former active geo-rep server disappears from that command's output when I shut it down.

Passwordless ssh connectivity and gverify.sh work from the passive node to the slave.

Any advice on troubleshooting this?

Thanks much,

Doug
Doug Wilson
2014-Jun-20 20:54 UTC
[Gluster-users] 3.5.0 distributed geo-replication & replica failure
Figured out my issue. Posting here just in case it's helpful to someone else some day.

This was because passwordless ssh for the passive node really wasn't set up properly. An invalid command restriction entry got put into the common_secret.pem.pub file. Once I fixed that and copied its contents to /root/.ssh/authorized_keys on the slave, it began working, and the geo-rep status showed it as "Passive."

The first line of the bad authorized_keys entry looked something like:

command="/usr/libexec/glusterfs/gsyncd" command="/usr/lib/x86_64-linux-gnu/glusterfs/gsyncd" ssh-rsa AAAAB3NzaC

It had two command= restrictions, each pointing at a different gsyncd path. After confirming that /usr/lib/x86_64-linux-gnu/glusterfs/gsyncd was the path to gsyncd on the slave, I changed the entry to:

command="/usr/lib/x86_64-linux-gnu/glusterfs/gsyncd" ssh-rsa AAAAB3NzaC

That fixed the issue.

Thanks,

Doug
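For anyone who wants to check for the same problem, here is a minimal Python sketch that flags authorized_keys (or common_secret.pem.pub) entries carrying more than one command= restriction, which is what the bad entry above had. It is only an illustration, not part of GlusterFS: the function names and the list of key types are my own assumptions, and the parsing is deliberately simple rather than a full implementation of the authorized_keys format.

```python
import re

# Matches a command="..." option, allowing backslash-escaped quotes in the value.
COMMAND_OPTION = re.compile(r'command="((?:[^"\\]|\\.)*)"')

# Key types that mark the end of the options field.
# Assumption: this list is illustrative and not exhaustive.
KEY_TYPE = re.compile(r'(?:^|\s)(?:ssh-rsa|ssh-dss|ssh-ed25519|ecdsa-sha2-\S+)\s')


def forced_commands(line):
    """Return the command="..." values found before the key type on one entry."""
    match = KEY_TYPE.search(line)
    options = line[:match.start()] if match else line
    return COMMAND_OPTION.findall(options)


def find_bad_entries(text):
    """Return (line_number, commands) for entries with more than one command= option."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        commands = forced_commands(stripped)
        if len(commands) > 1:
            bad.append((number, commands))
    return bad


if __name__ == "__main__":
    import sys

    # Usage: python check_keys.py /root/.ssh/authorized_keys
    with open(sys.argv[1]) as handle:
        for number, commands in find_bad_entries(handle.read()):
            print("line %d has %d command= options: %s"
                  % (number, len(commands), ", ".join(commands)))
```

The script only reports suspicious lines. Which gsyncd path is the right one still has to be confirmed by hand on the slave, as described above.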