On 12/14/2014 02:57 AM, wodel youchi wrote:
> Hi again,
>
> I had a hard time configuring geo-replication. After creating the
> session, starting it gave me a Faulty state, and I had to deal with it:
> I had to change /nonexistent/gsyncd in
> /var/lib/glusterd/geo-replication/data1_node3.example_data2/gsyncd.conf
> to /usr/libexec/gluster/gsyncd
When you get /nonexistent/gsyncd, it means there is an issue with the pem key setup.
For security reasons, while pushing the pem keys to the slave we prepend
command=<COMMAND TO RUN> to the key in /root/.ssh/authorized_keys.
For example,
command="/usr/local/libexec/glusterfs/gsyncd" ssh-rsa
AAAAB3NzaC1yc2EAAAABIwAAAQEAqxqMiZ8dyXUQq0pLVOYpRSsC+aYFn6pbPQZ3LtRPKGYfA63SNoYni
fhnM2UR9fnZz3hisBUxIzcVrVux2y3ojI/vPFFi08tVtK8/r9yoqqh3HqlBnotY50H1/1qeco+71U9hy276fUONP64KoOZtme3MwYuoNz
4z1NvCQFcEbXtPfHO5A9P3C+NuMhgNK8N63RSCzZ6dtO+wZygbVJlbPNQxp8Y5E8rbIuzRy6bD/0nmEKc/nqvEYTYgkck
ES0Xy92JVxbcwCOnZFNi4rT6+HarDIuFRB835I5ss+QBrT9SM09qmFuQ== root at fedoravm1
If the slave machine's /root/.ssh/authorized_keys file has any other entry
with the same key, you will get the /nonexistent/gsyncd error.
The advantage of having "command=" in an authorized key is that even if a
master node is compromised and an attacker logs in to a slave node, the
logged-in user is limited to running gsyncd, or whatever command is
specified in authorized_keys.
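A minimal sketch of how a duplicate entry causes this, using a temporary file as a stand-in for the slave's /root/.ssh/authorized_keys (the key material and paths below are placeholders, not real values):

```shell
#!/bin/sh
# Sketch: a duplicate, unrestricted entry for the same public key shadows
# the restricted command= entry. sshd honors the FIRST line in
# authorized_keys that matches the offered key, so if an unrestricted
# copy of the key comes first, the forced gsyncd command is never applied.
AUTH_KEYS=$(mktemp)
cat > "$AUTH_KEYS" <<'EOF'
ssh-rsa AAAAB3...PLACEHOLDER root@master1
command="/usr/libexec/glusterfs/gsyncd" ssh-rsa AAAAB3...PLACEHOLDER root@master1
EOF

# First matching line for this key -- this is what sshd would honor:
grep 'PLACEHOLDER' "$AUTH_KEYS" | head -n1

# Count of entries that carry the forced-command restriction:
grep -c '^command=' "$AUTH_KEYS"

rm -f "$AUTH_KEYS"
```

On a real slave, the equivalent check would be to look for the geo-rep key in /root/.ssh/authorized_keys and remove any duplicate of it that lacks the `command=` prefix.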
>
> after that, geo-replication started, but nothing happened: no file or
> directory was synced.
>
> the command: gluster volume geo-replication data1
> geoaccount at node3.example.com::data2 status
>
> keeps saying: Changelog Crawl
>
> [2014-12-13 21:57:17.129836] W [master(/mnt/srv1/brick1):1005:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1418504186
> [2014-12-13 21:57:22.648163] W [master(/mnt/srv1/brick1):294:regjob]
> _GMaster: Rsync: .gfid/f066ca4a-2d31-4342-bc7e-a37da25b2253 [errcode: 23]
> [2014-12-13 21:57:22.648426] W [master(/mnt/srv1/brick1):986:process]
> _GMaster: changelogs CHANGELOG.1418504186 could not be processed -
> moving on...
>
> but new files/directories were synced, so I deleted all the data on the
> master cluster and recreated it, and everything was synced,
> but the status command (above) keeps saying Changelog Crawl
>
> On the slave node I have these logs
>
> [2014-12-13 21:11:12.149041] W
> [client-rpc-fops.c:1210:client3_3_removexattr_cbk] 0-data2-client-0:
> remote operation failed: No data available
> [2014-12-13 21:11:12.149067] W [fuse-bridge.c:1261:fuse_err_cbk]
> 0-glusterfs-fuse: 3243: REMOVEXATTR()
> /.gfid/cd26bcc2-9b9f-455a-a2b2-9b6358f24203 => -1 (No data available)
> [2014-12-13 21:11:12.516674] W
> [client-rpc-fops.c:1210:client3_3_removexattr_cbk] 0-data2-client-0:
> remote operation failed: No data available
> [2014-12-13 21:11:12.516705] W [fuse-bridge.c:1261:fuse_err_cbk]
> 0-glusterfs-fuse: 3325: REMOVEXATTR()
> /.gfid/e6142b37-2362-4c95-a291-f396a122b014 => -1 (No data available)
> [2014-12-13 21:11:12.517577] W
> [client-rpc-fops.c:1210:client3_3_removexattr_cbk] 0-data2-client-0:
> remote operation failed: No data available
> [2014-12-13 21:11:12.517600] W [fuse-bridge.c:1261:fuse_err_cbk]
> 0-glusterfs-fuse: 3331: REMOVEXATTR()
> /.gfid/e6142b37-2362-4c95-a291-f396a122b014 => -1 (No data available)
>
> and
> [2014-12-13 21:57:16.741321] W [syncdutils(slave):480:errno_wrap]
> <top>: reached maximum retries
> (['.gfid/ba9c75ef-d4f7-4a6b-923f-82a8c7be4443',
> 'glusterfs.gfid.newfile',
>
'\x00\x00\x00\x1b\x00\x00\x00\x1bcab3ae81-7b52-4c55-ac33-37814ff374c4\x00\x00\x00\x81\xb0glustercli1.lower-test\x00\x00\x00\x01\xb0\x00\x00\x00\x00\x00\x00\x00\x00'])...
> [2014-12-13 21:57:22.400269] W [syncdutils(slave):480:errno_wrap]
> <top>: reached maximum retries
> (['.gfid/ba9c75ef-d4f7-4a6b-923f-82a8c7be4443',
> 'glusterfs.gfid.newfile',
>
'\x00\x00\x00\x1b\x00\x00\x00\x1bcab3ae81-7b52-4c55-ac33-37814ff374c4\x00\x00\x00\x81\xb0glustercli1.lower-test\x00\x00\x00\x01\xb0\x00\x00\x00\x00\x00\x00\x00\x00'])...
>
> I couldn't find out what this means.
Did your slave have data before the geo-replication session was created? From
the log, what I can see is that geo-rep is failing to create a file on the
slave (maybe a file with the same name but a different GFID already exists;
a GFID is GlusterFS's unique identifier for a file).
Rsync is failing because the file was not created on the slave, so it is
unable to sync.
--
regards
Aravinda
> any idea.
>
> Regards
>
>
> On Friday, December 12, 2014 at 7:35 PM, wodel youchi
> <wodel_doom at yahoo.fr> wrote:
>
>
> Thanks for your reply,
>
> When executing the gverify.sh script, I got these errors in slave.log:
> [2014-12-12 18:12:45.423669] I
> [options.c:1163:xlator_option_init_double] 0-fuse: option
> attribute-timeout convertion failed value 1.0
> [2014-12-12 18:12:45.423689] E [xlator.c:425:xlator_init] 0-fuse:
> Initialization of volume 'fuse' failed, review your volfile again
>
> I think the problem is linked to the locale variables; mine were:
>
> LANG=fr_FR.UTF-8
> LC_CTYPE="fr_FR.UTF-8"
> LC_NUMERIC="fr_FR.UTF-8"
> LC_TIME="fr_FR.UTF-8"
> LC_COLLATE="fr_FR.UTF-8"
> LC_MONETARY="fr_FR.UTF-8"
> LC_MESSAGES="fr_FR.UTF-8"
> LC_PAPER="fr_FR.UTF-8"
> LC_NAME="fr_FR.UTF-8"
> LC_ADDRESS="fr_FR.UTF-8"
> LC_TELEPHONE="fr_FR.UTF-8"
> LC_MEASUREMENT="fr_FR.UTF-8"
> LC_IDENTIFICATION="fr_FR.UTF-8"
> LC_ALL=
>
> I changed LC_CTYPE and LC_NUMERIC to C and then executed the
> gverify.sh script again, and it worked, but the gluster vol geo-rep ...
> command still failed.
>
> I then changed the /etc/locale.conf file, modified LANG from
> fr_FR.UTF-8 to C, and rebooted the VM, and voilà: the geo-replication
> session was created successfully.
>
> But I am not sure whether my changes will affect other things.
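The locale suspicion can be checked with a small sketch. This uses the shell's own number parsing as a stand-in for glusterfs's option conversion (an assumption: both go through the LC_NUMERIC-aware C strtod(), under which fr_FR expects "," rather than "." as the decimal separator, so "1.0" can fail to convert; fr_FR.UTF-8 being generated on the host is also an assumption):

```shell
#!/bin/sh
# If fr_FR.UTF-8 is available, try parsing "1.0" under it; the decimal
# separator is "," there, so parsing may fail or warn.
if locale -a 2>/dev/null | grep -qi '^fr_FR\.utf-\?8$'; then
    LC_NUMERIC=fr_FR.UTF-8 printf '%.1f\n' 1.0 2>&1 || true
fi

# Under the C locale, "1.0" parses cleanly and prints 1.0.
LC_NUMERIC=C printf '%.1f\n' 1.0
```

A narrower alternative to changing the system-wide LANG would be overriding only LC_NUMERIC for the gluster processes, but whether that reaches gsyncd depends on how glusterd spawns it, so test it before relying on it.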
>
> Regards
>
>
>
>
> On Friday, December 12, 2014 at 8:30 AM, Kotresh Hiremath Ravishankar
> <khiremat at redhat.com> wrote:
>
>
> Hi,
>
> The setup is failing during the compatibility test between the master and
> slave clusters.
> The gverify.sh script is failing to get the master volume details.
>
> Could you run the following and paste the output here?
>
> bash -x /usr/local/libexec/glusterfs/gverify.sh <master-vol-name> root
> <slave-host-name> <slave-vol-name> <temp-log-file>
>
> If installed from source, gverify.sh is found at the above path; if
> installed from an RPM, it is at /usr/libexec/glusterfs/gverify.sh.
>
> If you are sure the master and slave gluster versions and volume sizes
> are fine, the easy workaround is to use force:
>
> gluster vol geo-replication <master-vol> <slave-host>::<slave-vol>
> create push-pem force
>
>
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> From: "wodel youchi" <wodel_doom at yahoo.fr>
> To: gluster-users at gluster.org
> Sent: Friday, December 12, 2014 3:13:48 AM
> Subject: [Gluster-users] Problem creating Geo-replication
>
> Hi,
> I am using CentOS 7 x64 with updates, and GlusterFS 3.6 from the
> http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/epel-7Server/
> repository.
> No firewall and no SELinux.
>
>
> I have two nodes with a distributed-replicated volume, data1,
> and a third node with a distributed volume, data2.
> The two volumes have the same size.
>
> I'm having trouble configuring geo-replication to the third node. I've been
> following the Red Hat Storage 3 Admin Guide, but it does not work.
>
> I've created passwordless SSH connections between the nodes, and
> followed these commands:
>
> On the master:
> # gluster system:: execute gsec_create
> Common secret pub file present at
> /var/lib/glusterd/geo-replication/common_secret.pem.pub
>
> # gluster volume geo-replication data1 node3.example.com::data2 create
> push-pem
> Unable to fetch master volume details. Please check the master cluster
> and master volume.
> geo-replication command failed
>
> From /var/log/glusterfs/etc-glusterfs-glusterd.vol.log I get these
> error messages:
> [2014-12-11 21:34:47.152644] E
> [glusterd-geo-rep.c:2012:glusterd_verify_slave] 0-: Not a valid slave
> [2014-12-11 21:34:47.152750] E
> [glusterd-geo-rep.c:2240:glusterd_op_stage_gsync_create] 0-:
> node3.example.com::data2 is not a valid slave volume. Error: Unable to
> fetch master volume details. Please check the master cluster and
> master volume.
> [2014-12-11 21:34:47.152764] E
> [glusterd-syncop.c:1151:gd_stage_op_phase] 0-management: Staging of
> operation 'Volume Geo-replication Create' failed on localhost :
Unable
> to fetch master volume details. Please check the master cluster and
> master volume.
> [2014-12-11 21:35:25.559144] E
> [glusterd-handshake.c:914:gd_validate_mgmt_hndsk_req] 0-management:
> Rejecting management handshake request from unknown peer 192.168.1.9:1005
>
> the 192.168.1.9 is the IP address of the 3rd node.
>
> Any idea?
>
> thanks
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>
>
>