Janvre, Pierre-Marie (Agoda)
2015-Jan-26 08:46 UTC
[Gluster-users] Gluster 3.6.1 Geo-Replication Faulty
Hi All,

I am setting up a new Gluster environment between two datacenters and geo-replication is coming up faulty. Here is the setup:

Datacenter A, 2 nodes (1 brick per node, replica):
- master_node1
- master_node2

Datacenter B, 2 nodes (1 brick per node, replica):
- slave_node1
- slave_node2

OS: CentOS 6.6
Gluster: glusterfs 3.6.1 built on Nov 7 2014 15:15:48

The bricks were set up without any error, and passwordless SSH authentication between node 1 of datacenter A and node 1 of datacenter B was set up successfully.

Geo-replication was created as below:

gluster system:: execute gsec_create
gluster volume geo-replication master_volume root@slave_node1::slave_volume create push-pem

I can start geo-replication successfully:

gluster volume geo-replication master_volume root@slave_node1::slave_volume start

But when I check the status, I get the following:

gluster volume geo-replication master_volume root@slave_node1::slave_volume status

MASTER NODE     MASTER VOL       MASTER BRICK      SLAVE                             STATUS    CHECKPOINT STATUS    CRAWL STATUS
---------------------------------------------------------------------------------------------------------------------------------
master_node1    master_volume    /master_brick1    root@slave_node1::slave_volume    faulty    N/A                  N/A
master_node2    master_volume    /master_brick2    root@slave_node1::slave_volume    faulty    N/A                  N/A
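A minimal sanity check of the SSH path gsyncd uses, assuming the default secret.pem location that push-pem distributes keys for (the path may differ on other installations), is to run the following from master_node1. The forced command on the slave may start gsyncd instead of a shell, but there should be no password prompt:

# Assumed default key location; adjust if your installation stores it elsewhere.
# Success means key-based auth works end to end for the geo-replication user.
ssh -i /var/lib/glusterd/geo-replication/secret.pem root@slave_node1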
From master node 1, I ran geo-replication in debug mode and found the following in the logs:

[2015-01-26 15:33:29.247694] D [monitor(monitor):280:distribute] <top>: master bricks: [{'host': 'master_node1', 'dir': '/master_brick1'}, {'host': 'master_node2', 'dir': '/master_brick2'}]
[2015-01-26 15:33:29.248047] D [monitor(monitor):286:distribute] <top>: slave SSH gateway: root@slave_node1
[2015-01-26 15:33:29.721532] I [monitor(monitor):296:distribute] <top>: slave bricks: [{'host': 'slave_node1', 'dir': '/slave_brick1'}, {'host': 'slave_node2', 'dir': '/slave_brick2'}]
[2015-01-26 15:33:29.729722] I [monitor(monitor):316:distribute] <top>: worker specs: [('/master_brick1', 'ssh://root@slave_node2:gluster://localhost:slave_volume')]
[2015-01-26 15:33:29.730287] I [monitor(monitor):109:set_state] Monitor: new state: Initializing...
[2015-01-26 15:33:29.731513] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:33:29.731647] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:33:29.830656] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:33:29.831882] I [monitor(monitor):214:monitor] Monitor: worker(/master_brick1) died before establishing connection
[2015-01-26 15:33:29.831476] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:33:29.832392] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:33:29.832693] I [syncdutils(agent):214:finalize] <top>: exiting.
[2015-01-26 15:33:29.834060] I [monitor(monitor):109:set_state] Monitor: new state: faulty
[2015-01-26 15:33:39.846858] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:33:39.847105] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:33:39.941967] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:33:39.942630] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:33:39.945791] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:33:39.945941] I [syncdutils(agent):214:finalize] <top>: exiting.
[2015-01-26 15:33:39.945904] I [monitor(monitor):214:monitor] Monitor: worker(/master_brick1) died before establishing connection
[2015-01-26 15:33:49.959361] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:33:49.959599] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:33:50.56200] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:33:50.56809] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:33:50.58903] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:33:50.59078] I [syncdutils(agent):214:finalize] <top>: exiting.
[2015-01-26 15:33:50.59039] I [monitor(monitor):214:monitor] Monitor: worker(/master_brick1) died before establishing connection
[2015-01-26 15:34:00.72674] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:34:00.72926] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:34:00.169071] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:34:00.169931] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:34:00.170466] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:34:00.170526] I [monitor(monitor):214:monitor] Monitor: worker(/master_brick1) died before establishing connection
[2015-01-26 15:34:00.170938] I [syncdutils(agent):214:finalize] <top>: exiting.
[2015-01-26 15:34:10.183361] I [monitor(monitor):163:monitor] Monitor: ------------------------------------------------------------
[2015-01-26 15:34:10.183614] I [monitor(monitor):164:monitor] Monitor: starting gsyncd worker
[2015-01-26 15:34:10.278914] D [monitor(monitor):217:monitor] Monitor: worker(/master_brick1) connected
[2015-01-26 15:34:10.279994] I [monitor(monitor):222:monitor] Monitor: worker(/master_brick1) died in startup phase
[2015-01-26 15:34:10.282217] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,10,9,8'
[2015-01-26 15:34:10.282943] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2015-01-26 15:34:10.283098] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...
[2015-01-26 15:34:10.283303] I [syncdutils(agent):214:finalize] <top>: exiting.

Any idea how to solve this issue?

Thank you,
PM
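For reference, a minimal sketch of where these geo-replication logs live and how the log level is typically raised, assuming default log paths; the config key name is an assumption and can vary between releases, so list the current settings with the plain "config" subcommand first:

# Assumed config key; run "... config" without arguments to confirm the exact name.
gluster volume geo-replication master_volume root@slave_node1::slave_volume config log-level DEBUG

# Master-side monitor/worker logs (the entries quoted above come from here).
ls /var/log/glusterfs/geo-replication/master_volume/

# Slave-side gsyncd logs on slave_node1 often show why a worker died during startup.
ls /var/log/glusterfs/geo-replication-slaves/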
Janvre, Pierre-Marie (Agoda)
2015-Jan-29 07:19 UTC
[Gluster-users] Gluster 3.6.1 Geo-Replication Faulty
Hi All,

No ideas from your side? Is nobody using geo-replication?

I updated to version 3.6.2 and the issue remains the same.

Can someone help, please?

Thanks,
PM

From: gluster-users-bounces@gluster.org [mailto:gluster-users-bounces@gluster.org] On Behalf Of Janvre, Pierre-Marie (Agoda)
Sent: Monday, 26 January, 2015 15:46
To: gluster-users@gluster.org
Subject: [Gluster-users] Gluster 3.6.1 Geo-Replication Faulty