Janvre, Pierre-Marie (Agoda)
2015-Jan-26 08:46 UTC
[Gluster-users] Gluster 3.6.1 Geo-Replication Faulty
Hi All,

I am setting up a new Gluster environment between two datacenters and geo-replication is coming up faulty. Here is the setup:

Datacenter A, 2 nodes (1 brick per node, replica):
- master_node1
- master_node2

Datacenter B, 2 nodes (1 brick per node, replica):
- slave_node1
- slave_node2

OS: CentOS 6.6
Gluster: glusterfs 3.6.1 built on Nov 7 2014 15:15:48

The bricks were set up without any error, and passwordless SSH authentication between node 1 of datacenter A and node 1 of datacenter B was set up successfully.

Geo-replication was created as below:

gluster system:: execute gsec_create
gluster volume geo-replication master_volume root@slave_node1::slave_volume create push-pem

I can start geo-replication successfully:

gluster volume geo-replication master_volume root@slave_node1::slave_volume start

But when I check the status, I get the following:

gluster volume geo-replication master_volume root@slave_node1::slave_volume status

MASTER NODE     MASTER VOL       MASTER BRICK      SLAVE                             STATUS    CHECKPOINT STATUS    CRAWL STATUS
---------------------------------------------------------------------------------------------------------------------------------
master_node1    master_volume    /master_brick1    root@slave_node1::slave_volume    faulty    N/A                  N/A
master_node2    master_volume    /master_brick2    root@slave_node1::slave_volume    faulty    N/A                  N/A
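A minimal sanity check of the SSH path gsyncd uses, assuming the default secret.pem location that push-pem distributes keys for (the path may differ on other installations), is to run the following from master_node1. The forced command on the slave may start gsyncd instead of a shell, but there should be no password prompt:

# Assumed default key location; adjust if your installation stores it elsewhere.
# Success means key-based auth works end to end for the geo-replication user.
ssh -i /var/lib/glusterd/geo-replication/secret.pem root@slave_node1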
From master node 1, I ran geo-replication in debug mode and found the following in the logs:

[2015-01-26 15:33:29.247694] D [monitor(monitor):280:distribute] <top>: master bricks: [{'host': 'master_node1', 'dir': '/master_brick1'}, {'host': 'master_node2', 'dir': '/master_brick2'}]
[2015-01-26 15:33:29.248047] D [monitor(monitor):286:distribute] <top>: slave SSH gateway: root@slave_node1
[2015-01-26 15:33:29.721532] I [monitor(monitor):296:distribute] <top>: slave bricks: [{'host': 'slave_node1', 'dir': '/slave_brick1'}, {'host': 'slave_node2', 'dir': '/slave_brick2'}]
[2015-01-26 15:33:29.729722] I [monitor(monitor):316:distribute] <top>: worker specs: [('/master_brick1', 'ssh://root@slave_node2:gluster://localhost:slave_volume')]
[2015-01-26 15:33:29.730287] I [monitor(monitor):109:set_state] Monitor: new state: Initializing...
[2015-01-26 15:33:29.731513] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:33:29.731647] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:33:29.830656] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:33:29.831882] I [monitor(monitor):214:monitor] Monitor: worker(/master_brick1) died before establishing connection
[2015-01-26 15:33:29.831476] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:33:29.832392] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:33:29.832693] I [syncdutils(agent):214:finalize] <top>: exiting.
[2015-01-26 15:33:29.834060] I [monitor(monitor):109:set_state] Monitor: new state: faulty
[2015-01-26 15:33:39.846858] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:33:39.847105] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:33:39.941967] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:33:39.942630] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:33:39.945791] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:33:39.945941] I [syncdutils(agent):214:finalize] <top>: exiting.
[2015-01-26 15:33:39.945904] I [monitor(monitor):214:monitor] Monitor: worker(/master_brick1) died before establishing connection
[2015-01-26 15:33:49.959361] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:33:49.959599] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:33:50.56200] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:33:50.56809] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:33:50.58903] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:33:50.59078] I [syncdutils(agent):214:finalize] <top>: exiting.
[2015-01-26 15:33:50.59039] I [monitor(monitor):214:monitor] Monitor: worker(/master_brick1) died before establishing connection
[2015-01-26 15:34:00.72674] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:34:00.72926] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:34:00.169071] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:34:00.169931] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:34:00.170466] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:34:00.170526] I [monitor(monitor):214:monitor] Monitor: worker(/master_brick1) died before establishing connection
[2015-01-26 15:34:00.170938] I [syncdutils(agent):214:finalize] <top>: exiting.
[2015-01-26 15:34:10.183361] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:34:10.183614] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:34:10.278914] D [monitor(monitor):217:monitor] Monitor: worker(/master_brick1) connected
[2015-01-26 15:34:10.279994] I [monitor(monitor):222:monitor] Monitor: worker(/master_brick1) died in startup phase
[2015-01-26 15:34:10.282217] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:34:10.282943] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:34:10.283098] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:34:10.283303] I [syncdutils(agent):214:finalize] <top>: exiting.

Any idea how to solve this issue?

Thank you,
PM
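For reference, a minimal sketch of where these geo-replication logs live and how the log level is typically raised, assuming default log paths; the config key name is an assumption and can vary between releases, so list the current settings with the plain "config" subcommand first:

# Assumed config key; run "... config" without arguments to confirm the exact name.
gluster volume geo-replication master_volume root@slave_node1::slave_volume config log-level DEBUG

# Master-side monitor/worker logs (the entries quoted above come from here).
ls /var/log/glusterfs/geo-replication/master_volume/

# Slave-side gsyncd logs on slave_node1 often show why a worker died during startup.
ls /var/log/glusterfs/geo-replication-slaves/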
Janvre, Pierre-Marie (Agoda)
2015-Jan-29 07:19 UTC
[Gluster-users] Gluster 3.6.1 Geo-Replication Faulty
Hi All,

No ideas from your side? Is nobody using geo-replication?

I updated to version 3.6.2 and the issue remains the same.

Can someone help, please?

Thanks,
PM

From: gluster-users-bounces@gluster.org [mailto:gluster-users-bounces@gluster.org] On Behalf Of Janvre, Pierre-Marie (Agoda)
Sent: Monday, 26 January, 2015 15:46
To: gluster-users@gluster.org
Subject: [Gluster-users] Gluster 3.6.1 Geo-Replication Faulty