wodel youchi
2015-May-25 12:25 UTC
[Gluster-users] [Centos7x64] Geo-replication problem glusterfs 3.7.0-2
Hi, and thanks for your replies.

For Kotresh: No, I am not using tar over ssh for my geo-replication.

For Aravinda: I had to recreate my slave volume from scratch and restart the geo-replication.

If I have thousands of files with this problem, do I have to execute the fix for all of them, or is there an easier way? Can checkpoints help me in this situation? And, more importantly, what can cause this problem?

I am syncing containers, and they contain a lot of small files. Would tar over ssh be more suitable?

PS: I tried to execute this command on the Master:

bash generate-gfid-file.sh localhost:data2 $PWD/get-gfid.sh /tmp/master_gfid_file.txt

but I got errors with files that have a blank (space) in their names, for example "Admin Guide.pdf": the script sees two files, "Admin" and "Guide.pdf", and then get-gfid.sh returns "no such file or directory" errors. (A space-safe variant is sketched after this message.)

thanks.

2015-05-25 7:00 GMT+01:00 Aravinda <avishwan at redhat.com>:

> Looks like this is GFID conflict issue not the tarssh issue. > > _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': > 'e529a399-756d-4cb1-9779-0af2822a0d94', 'gid': 0, 'mode': 33152, 'entry': > '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb', 'op': 'CREATE'}, 2) > > Data: {'uid': 0, > 'gfid': 'e529a399-756d-4cb1-9779-0af2822a0d94', > 'gid': 0, > 'mode': 33152, > 'entry': '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb', > 'op': 'CREATE'} > > and Error: 2 > > During creation of "main.mdb" RPC failed with error number 2, ie, ENOENT. > This error comes when parent directory not exists or exists with different > GFID. > In this case Parent GFID "874799ef-df75-437b-bc8f-3fcd58b54789" does not > exists on slave. > > > To fix the issue, > ----------------- > Find the parent directory of "main.mdb", > Get the GFID of that directory, using getfattr > Check the GFID of the same directory in Slave(To confirm GFIDs are > different) > To fix the issue, Delete that directory in Slave. > Set virtual xattr for that directory and all the files inside that > directory. > setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR> > setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path> > > > Geo-rep will recreate the directory with Proper GFID and starts sync. > > Let us know if you need any help. > > -- > regards > Aravinda > > > > > On 05/25/2015 10:54 AM, Kotresh Hiremath Ravishankar wrote: > >> Hi Wodel, >> >> Is the sync mode, tar over ssh (i.e., config use_tarssh is true) ? >> If yes, there is known issue with it and patch is already up in master. >> >> But it can be resolved in either of the two ways. >> >> 1. If sync mode required is tar over ssh, just disable sync_xattrs which >> is true >> by default. >> >> gluster vol geo-rep <master-vol> <slave-host>::<slave-vol> config >> sync_xattrs false >> >> 2. If sync mode is ok to be changed to rsync. Please do. >> gluster vol geo-rep <master-vol> <slave-host>::<slave-vol> >> use_tarssh false >> >> NOTE: rsync supports syncing of acls and xattrs where as tar over ssh >> does not. >> In 3.7.0-2, tar over ssh should be used with sync_xattrs to false >> >> Hope this helps. >> >> Thanks and Regards, >> Kotresh H R >> >> ----- Original Message ----- >> >>> From: "wodel youchi" <wodel.youchi at gmail.com> >>> To: "gluster-users" <gluster-users at gluster.org> >>> Sent: Sunday, May 24, 2015 3:31:38 AM >>> Subject: [Gluster-users] [Centos7x64] Geo-replication problem glusterfs >>> 3.7.0-2 >>> >>> Hi, >>> >>> I have two gluster servers in replicated mode as MASTERS >>> and one server for replicated geo-replication.
>>> >>> I've updated my glusterfs installation to 3.7.0-2, all three servers >>> >>> I've recreated my slave volumes >>> I've started the geo-replication, it worked for a while and now I have >>> some >>> problmes >>> >>> 1- Files/directories are not deleted on slave >>> 2- New files/rectories are not synced to the slave. >>> >>> I have these lines on the active master >>> >>> [2015-05-23 06:21:17.156939] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> 'e529a399-756d-4cb1-9779-0af2822a0d94', 'gid': 0, 'mode': 33152, 'entry': >>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb', 'op': 'CREATE'}, >>> 2) >>> [2015-05-23 06:21:17.158066] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> 'b4bffa4c-2e88-4b60-9f6a-c665c4d9f7ed', 'gid': 0, 'mode': 33152, 'entry': >>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.hdb', 'op': 'CREATE'}, >>> 2) >>> [2015-05-23 06:21:17.159154] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> '9920cdee-6b87-4408-834b-4389f5d451fe', 'gid': 0, 'mode': 33152, 'entry': >>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.db', 'op': 'CREATE'}, 2) >>> [2015-05-23 06:21:17.160242] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> '307756d2-d924-456f-b090-10d3ff9caccb', 'gid': 0, 'mode': 33152, 'entry': >>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.ndb', 'op': 'CREATE'}, >>> 2) >>> [2015-05-23 06:21:17.161283] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> '69ebb4cb-1157-434b-a6e9-386bea81fc1d', 'gid': 0, 'mode': 33152, 'entry': >>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/COPYING', 'op': 'CREATE'}, 2) >>> [2015-05-23 06:21:17.162368] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> '7d132fda-fc82-4ad8-8b6c-66009999650c', 'gid': 0, 'mode': 33152, 'entry': >>> '.gfid/f6f2582e-0c5c-4cba-943a-6d5f64baf340/daily.cld', 'op': 'CREATE'}, >>> 2) >>> [2015-05-23 06:21:17.163718] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> 'd8a0303e-ba45-4e45-a8fd-17994c34687b', 'gid': 0, 'mode': 16832, 'entry': >>> >>> '.gfid/f6f2582e-0c5c-4cba-943a-6d5f64baf340/clamav-54acc14b44e696e1cfb4a75ecc395fe0', >>> 'op': 'MKDIR'}, 2) >>> [2015-05-23 06:21:17.165102] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> '49d42bf6-3146-42bd-bc29-e704927d6133', 'gid': 0, 'mode': 16832, 'entry': >>> >>> '.gfid/f6f2582e-0c5c-4cba-943a-6d5f64baf340/clamav-debec3aa6afe64bffaee8d099e76f3d4', >>> 'op': 'MKDIR'}, 2) >>> [2015-05-23 06:21:17.166147] W >>> [master(/mnt/brick2/brick):792:log_failures] >>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>> '1ddb93ae-3717-4347-910f-607afa67cdb0', 'gid': 0, 'mode': 33152, 'entry': >>> >>> '.gfid/49d42bf6-3146-42bd-bc29-e704927d6133/clamav-704a1e9a3e2c97ccac127632d7c6b8e4', >>> 'op': 'CREATE'}, 2) >>> >>> >>> in the slave lot of lines like this >>> >>> [2015-05-22 07:53:57.071999] W [fuse-bridge.c:1970:fuse_create_cbk] >>> 0-glusterfs-fuse: 25833: /.gfid/03a5a40b-c521-47ac-a4e3-916a6df42689 => >>> -1 >>> (Operation not permitted) >>> >>> >>> in the active master I have 3.7 GB of XSYNC-CHANGELOG.xxxxxxx files in >>> >>> 
/var/lib/misc/glusterfsd/data2/ssh%3A%2F%2Froot%4010.10.10.10%3Agluster%3A%2F%2F127.0.0.1%3Aslavedata2/e55761a256af4acfe9b4a419be62462a/xsync >>> >>> I don't know if this is normal. >>> >>> any idea? >>> >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> http://www.gluster.org/mailman/listinfo/gluster-users >>> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> http://www.gluster.org/mailman/listinfo/gluster-users >> > >
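For the "thousands of files" case and for names containing spaces (such as "Admin Guide.pdf"), the repair Aravinda describes above can be scripted. The sketch below is untested and makes assumptions: the master brick /mnt/brick2/brick and the slave host gserver3 are taken from the logs, while the master FUSE mount /mnt/data2 and the slave brick path are placeholders to adjust. Only the two setfattr trigger-sync calls come from the reply itself; everything else is plain getfattr/find/ssh.

#!/bin/bash
# Compare the GFID of one directory on the master brick with the same
# directory on the slave brick; if they differ, (after deleting the
# mismatched directory on the slave, as described above) mark the
# directory and everything under it for re-sync via the master mount.
MASTER_BRICK=/mnt/brick2/brick      # master brick root, from the logs
MASTER_MOUNT=/mnt/data2             # assumed FUSE mount of the data2 volume
SLAVE_HOST=gserver3                 # slave host, from the logs
SLAVE_BRICK=/mnt/brick2/brick       # assumed slave brick root

DIR_REL="$1"                        # directory path relative to the volume root

master_gfid=$(getfattr -n trusted.gfid -e hex --only-values "$MASTER_BRICK/$DIR_REL" 2>/dev/null)
slave_gfid=$(ssh "root@$SLAVE_HOST" "getfattr -n trusted.gfid -e hex --only-values '$SLAVE_BRICK/$DIR_REL'" 2>/dev/null)
echo "master GFID: $master_gfid"
echo "slave  GFID: $slave_gfid"

if [ "$master_gfid" != "$slave_gfid" ]; then
    # Re-trigger sync for the directory and its contents. find -print0 with
    # a NUL-delimited read keeps names with spaces intact, which is what
    # breaks generate-gfid-file.sh / get-gfid.sh in the PS above.
    setfattr -n glusterfs.geo-rep.trigger-sync -v "1" "$MASTER_MOUNT/$DIR_REL"
    find "$MASTER_MOUNT/$DIR_REL" -mindepth 1 -print0 |
        while IFS= read -r -d '' path; do
            setfattr -n glusterfs.geo-rep.trigger-sync -v "1" "$path"
        done
fi

The same NUL-delimited find/read pattern (or quoting "$file" inside get-gfid.sh) is what keeps "Admin Guide.pdf" from being split into two arguments.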
wodel youchi
2015-May-26 12:28 UTC
[Gluster-users] [Centos7x64] Geo-replication problem glusterfs 3.7.0-2
Hi again,

As I mentioned earlier, I had to recreate my slave volume and restart the geo-replication again. As usual, the geo-replication went well at the beginning, but after restoring another container on the MASTERS, we started getting these errors:

On Master:

[2015-05-26 11:56:04.858262] I [monitor(monitor):222:monitor] Monitor: starting gsyncd worker [2015-05-26 11:56:04.966274] I [gsyncd(/mnt/brick2/brick):649:main_i] <top>: syncing: gluster://localhost:data2 -> ssh://root at gserver3 :gluster://localhost:slavedata2 [2015-05-26 11:56:04.967361] I [changelogagent(agent):75:__init__] ChangelogAgent: Agent listining... [2015-05-26 11:56:07.473591] I [master(/mnt/brick2/brick):83:gmaster_builder] <top>: setting up xsync change detection mode [2015-05-26 11:56:07.474025] I [master(/mnt/brick2/brick):404:__init__] _GMaster: using 'rsync' as the sync engine [2015-05-26 11:56:07.475222] I [master(/mnt/brick2/brick):83:gmaster_builder] <top>: setting up changelog change detection mode [2015-05-26 11:56:07.475511] I [master(/mnt/brick2/brick):404:__init__] _GMaster: using 'rsync' as the sync engine [2015-05-26 11:56:07.476761] I [master(/mnt/brick2/brick):83:gmaster_builder] <top>: setting up changeloghistory change detection mode [2015-05-26 11:56:07.477065] I [master(/mnt/brick2/brick):404:__init__] _GMaster: using 'rsync' as the sync engine [2015-05-26 11:56:09.528716] I [master(/mnt/brick2/brick):1197:register] _GMaster: xsync temp directory: /var/lib/misc/glusterfsd/data2/ssh%3A%2F%2Froot%4010.10.10.10%3Agluster%3A%2F%2F127.0.0.1%3Aslavedata2/e55761a256af4acfe9b4a419be62462a/xsync [2015-05-26 11:56:09.529055] I [resource(/mnt/brick2/brick):1434:service_loop] GLUSTER: Register time: 1432637769 [2015-05-26 11:56:09.545244] I [master(/mnt/brick2/brick):519:crawlwrap] _GMaster: primary master with volume id 107c9baa-f734-4926-8e7e-c60e3107284f ... [2015-05-26 11:56:09.567487] I [master(/mnt/brick2/brick):528:crawlwrap] _GMaster: crawl interval: 1 seconds [2015-05-26 11:56:09.585380] I [master(/mnt/brick2/brick):1112:crawl] _GMaster: starting history crawl...
turns: 1, stime: (1432580690, 0) [2015-05-26 11:56:10.591133] I [master(/mnt/brick2/brick):1141:crawl] _GMaster: slave's time: (1432580690, 0) [2015-05-26 11:56:16.564407] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/9f0887da-2243-470d-be92-49a6d85acf5d', 'stat': {'atime': 1432589079.955492, 'gid': 0, 'mtime': 1362693065.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.565541] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/1076aea5-6875-494f-a276-6268e443d86e', 'stat': {'atime': 1432589080.1354961, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.566585] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/2b449e9b-e9a7-4371-9e1b-de5d9e2407a0', 'stat': {'atime': 1432589080.0714946, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.567661] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/5c10f0cd-0ffa-41b6-b056-89d5f2ea7c9b', 'stat': {'atime': 1432589080.001493, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.568644] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/22b9e1b0-8f8e-4a17-a02f-e9f4a31e65b8', 'stat': {'atime': 1432589080.0674946, 'gid': 0, 'mtime': 1362693065.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.569616] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/0600d002-78dd-49e9-ab26-ee1f3ec81293', 'stat': {'atime': 1432589079.9294913, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.570667] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/8dd195ec-3698-45f6-82e4-2679a1731019', 'stat': {'atime': 1432589079.9764924, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.571583] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/13f2c030-7483-4924-bc0e-c12d97c65ed6', 'stat': {'atime': 1432589079.9794924, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.572529] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/6e23fedf-6b83-4f49-94f2-49d150dba857', 'stat': {'atime': 1432589080.0784948, 'gid': 0, 'mtime': 1362693065.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.573537] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/1b1695d7-0958-4db6-8dd8-917950fadd27', 'stat': {'atime': 1432589079.9414916, 'gid': 0, 'mtime': 1378284454.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.574553] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/c3795ae6-6e73-4b46-8aa2-fe296b927a42', 'stat': {'atime': 1432589080.0514941, 'gid': 0, 'mtime': 1362693065.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.575500] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/5544e740-fc67-42cd-9672-9d9fe2ad119f', 'stat': {'atime': 1432589080.0394938, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.576426] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/54d85a75-1d57-4a4c-b144-1aa70f52f88c', 'stat': {'atime': 1432589080.0164933, 'gid': 
0, 'mtime': 1362693065.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.577302] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/46435d6d-02d1-40a4-8018-84d60f15c793', 'stat': {'atime': 1432589079.964492, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.578196] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/1b16ad0b-0107-48e7-adac-2ee450c11181', 'stat': {'atime': 1432589079.9734924, 'gid': 0, 'mtime': 1403054465.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.579090] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/15b8f710-1467-47f4-891c-911fe4a6f66e', 'stat': {'atime': 1432589080.1074955, 'gid': 0, 'mtime': 1362693065.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.579996] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/97f115e6-8403-491b-9ec6-bf8e645f69ec', 'stat': {'atime': 1432589079.9704924, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.580945] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/894d48f8-1977-4d44-9e3f-31711ddf2432', 'stat': {'atime': 1432589079.9274912, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.581921] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/6c6190db-d2ed-48d9-8904-4e555b6650ab', 'stat': {'atime': 1432589080.0134933, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.582889] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/d1597e70-cf34-4516-92f8-8fd5f05f59b5', 'stat': {'atime': 1432589080.1234958, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:16.583786] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/b107565a-f6a5-4eee-89a6-acf6715b1d18', 'stat': {'atime': 1432589079.9514918, 'gid': 0, 'mtime': 1372762987.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.42256] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/b51de310-36a9-4ad6-8595-f2a7e08610fb', 'stat': {'atime': 1432589161.3073761, 'gid': 0, 'mtime': 1372763052.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.42618] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/3adc75a4-d293-4311-8d6d-00113797bb91', 'stat': {'atime': 1432589161.2773755, 'gid': 0, 'mtime': 1372763050.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.42836] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/881cfdd0-7a68-4678-ab86-9b301425ba1f', 'stat': {'atime': 1432589161.217374, 'gid': 0, 'mtime': 1372763054.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.43070] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/33004685-604c-4049-8fbf-7a4226a0ff68', 'stat': {'atime': 1432589161.215374, 'gid': 0, 'mtime': 1368045650.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.43327] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/b4b08a51-48c7-47ea-b980-8da5b96599d2', 'stat': {'atime': 1432589161.1853733, 'gid': 0, 'mtime': 1368045650.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) 
[2015-05-26 11:56:19.43549] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/6126413e-1526-4e33-b5be-2556c8c6a8cf', 'stat': {'atime': 1432589161.2253742, 'gid': 0, 'mtime': 1372763054.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.43762] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/cccb2dda-b88c-4d5a-9d73-09a113a1d6e8', 'stat': {'atime': 1432589161.2923758, 'gid': 0, 'mtime': 1372763054.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.44001] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/854ddd73-a95c-4207-a40c-30b8df301940', 'stat': {'atime': 1432589161.2643752, 'gid': 0, 'mtime': 1403054465.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.44230] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/e111377d-4b65-42af-b10c-0db0a93077ca', 'stat': {'atime': 1432589161.261375, 'gid': 0, 'mtime': 1371576397.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.44464] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/9898a879-30aa-450c-ba50-b046a706e8b8', 'stat': {'atime': 1432589161.3673775, 'gid': 0, 'mtime': 1372763054.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.44673] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/8918c4c9-83de-4a57-b26a-3cca1ccc9ad2', 'stat': {'atime': 1432589161.3623774, 'gid': 0, 'mtime': 1372763051.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.44924] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/5b852034-822f-493d-b524-a08c1e93d095', 'stat': {'atime': 1432589161.2533748, 'gid': 0, 'mtime': 1371576397.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.45156] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/382f67fc-b737-40c9-bd5d-8a8d52e3dd13', 'stat': {'atime': 1432589161.299376, 'gid': 0, 'mtime': 1372763053.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.45367] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/b650307f-9500-4ed8-9e6c-016317fdf203', 'stat': {'atime': 1432589161.3713777, 'gid': 0, 'mtime': 1372763051.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.45598] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/7fef67e7-d558-44c0-9e33-16609fae88bc', 'stat': {'atime': 1432589161.1833732, 'gid': 0, 'mtime': 1372763051.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.45835] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/6d80a054-acc5-420b-a5e0-6b6c2166ac08', 'stat': {'atime': 1432589161.3303766, 'gid': 0, 'mtime': 1397764212.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.46082] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/dfda8d4e-dbf2-4c0a-ad3f-14a1923187fb', 'stat': {'atime': 1432589161.3653774, 'gid': 0, 'mtime': 1368045650.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.46308] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/213965ed-7e02-41aa-a827-0dad01b34a78', 'stat': {'atime': 1432589161.395378, 'gid': 0, 'mtime': 1371576397.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.46533] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: 
META FAILED: ({'go': '.gfid/335a3c61-8792-44f3-bb84-a9e63bd50fe3', 'stat': {'atime': 1432589161.3103762, 'gid': 0, 'mtime': 1368045650.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.46752] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/5c40a3c9-687b-4651-ae5e-c8289531bf13', 'stat': {'atime': 1432589161.393378, 'gid': 0, 'mtime': 1379638431.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.46999] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/15711707-e31d-4c64-adb5-08504ae59a2b', 'stat': {'atime': 1432589161.172373, 'gid': 0, 'mtime': 1372763051.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.47262] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/c7c7514f-5fb8-4dc3-aff1-0c30d4815819', 'stat': {'atime': 1432589161.345377, 'gid': 0, 'mtime': 1372763051.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.47473] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/084f9c2d-40ca-4fe3-9e78-1bff2ecc7716', 'stat': {'atime': 1432589161.3593774, 'gid': 0, 'mtime': 1372763049.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.47693] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/9651d035-0260-41e2-97a2-9fb2b51ef0c9', 'stat': {'atime': 1432589161.3213766, 'gid': 0, 'mtime': 1368045650.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.47950] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/c6c49cdf-0b2b-4fa4-b2b9-398ebf3c589c', 'stat': {'atime': 1432589161.347377, 'gid': 0, 'mtime': 1372763053.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.48182] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/f6367f6f-3e19-4fa1-bbea-9a96c11d8bbc', 'stat': {'atime': 1432589161.1883733, 'gid': 0, 'mtime': 1372763053.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:19.48405] W [master(/mnt/brick2/brick):792:log_failures] _GMaster: META FAILED: ({'go': '.gfid/eeacb5af-8c99-4bfc-8495-9c84b119f9c7', 'stat': {'atime': 1432589161.2343745, 'gid': 0, 'mtime': 1412981693.0, 'mode': 41471, 'uid': 0}, 'op': 'META'}, 2) [2015-05-26 11:56:20.410108] E [repce(/mnt/brick2/brick):207:__call__] RepceClient: call 8099:140141675022144:1432637780.1 (meta_ops) failed on peer with OSError [2015-05-26 11:56:20.410460] E [syncdutils(/mnt/brick2/brick):276:log_raise_exception] <top>: FAIL: Traceback (most recent call last): File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 165, in main main_i() File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 659, in main_i local.service_loop(*[r for r in [remote] if r]) File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1440, in service_loop g3.crawlwrap(oneshot=True) File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 580, in crawlwrap self.crawl() File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1150, in crawl self.changelogs_batch_process(changes) File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1059, in changelogs_batch_process self.process(batch) File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 946, in process self.process_change(change, done, retry) File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 920, in process_change failures = self.slave.server.meta_ops(meta_entries) File 
"/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 226, in __call__ return self.ins(self.meth, *a) File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 208, in __call__ raise res OSError: [Errno 95] Operation not supported: '.gfid/d7f761f2-1dc5-4aef-bf3f-29d5de823fb0' [2015-05-26 11:56:20.412513] I [syncdutils(/mnt/brick2/brick):220:finalize] <top>: exiting. [2015-05-26 11:56:20.419653] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF. [2015-05-26 11:56:20.420038] I [syncdutils(agent):220:finalize] <top>: exiting. [2015-05-26 11:56:20.487646] I [monitor(monitor):282:monitor] Monitor: worker(/mnt/brick2/brick) died in startup phase On slave: [2015-05-26 11:56:05.336785] I [gsyncd(slave):649:main_i] <top>: syncing: gluster://localhost:slavedata2 [2015-05-26 11:56:06.371880] I [resource(slave):842:service_loop] GLUSTER: slave listening [2015-05-26 11:56:20.386070] E [repce(slave):117:worker] <top>: call failed: Traceback (most recent call last): File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker res = getattr(self.obj, rmeth)(*in_data[2:]) File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 745, in meta_ops [ENOENT], [ESTALE, EINVAL]) File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 475, in errno_wrap return call(*arg) OSError: [Errno 95] Operation not supported: '.gfid/d7f761f2-1dc5-4aef-bf3f-29d5de823fb0' [2015-05-26 11:56:20.397442] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF. [2015-05-26 11:56:20.397603] I [syncdutils(slave):220:finalize] <top>: exiting. [2015-05-26 11:56:30.827872] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF. [2015-05-26 11:56:31.25315] I [syncdutils(slave):220:finalize] <top>: exiting. the state of the replication is Active I searched about synchronization incomplete and I found this http://www.gluster.org/community/documentation/index.php/Gluster_3.2:_Troubleshooting_Geo-replication Synchronization is not complete Description: GlusterFS Geo-replication did not synchronize the data completely but still the geo-replication status display OK. Solution: You can enforce a full sync of the data by erasing the index and restarting GlusterFS Geo-replication. After restarting, GlusterFS Geo-replication begins synchronizing all the data, that is, all files will be compared with by means of being checksummed, which can be a lengthy /resource high utilization operation, mainly on large data sets (however, actual data loss will not occur). If the error situation persists, contact Gluster Support. For more information about erasing index, see Tuning Volume Options. But there no mention about how to erase the index, the only option I found is : geo-replication.indexing is that it? if yes, after disabling it, will the geo-replication verify all files on slave? when do I have to re-enable it again? thanks 2015-05-25 13:25 GMT+01:00 wodel youchi <wodel.youchi at gmail.com>:> Hi, and thanks for your replies. > > For Kotresh : No, I am not using tar ssh for my geo-replication. > > For Aravinda: I had to recreate my slave volume all over et restart the > geo-replication. > > If I have thousands of files with this problem, do I have to execute the > fix for all of them? is there an easy way? > Can checkpoints help me in this situation? > and more important, what can cause this problem? > > I am syncing containers, they contain lot of files small files, using tar > ssh, would it be more suitable? 
> > > PS: I tried to execute this command on the Master > > bash generate-gfid-file.sh localhost:data2 $PWD/get-gfid.sh /tmp/master_gfid_file.txt > > but I got errors with files that have blank (space) in their names, for example: Admin Guide.pdf > > the script sees two files Admin and Guide.pdf, then the get-gfid.sh returns errors "no such file or directory" > > thanks. > > > 2015-05-25 7:00 GMT+01:00 Aravinda <avishwan at redhat.com>: > >> Looks like this is GFID conflict issue not the tarssh issue. >> >> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >> 'e529a399-756d-4cb1-9779-0af2822a0d94', 'gid': 0, 'mode': 33152, 'entry': >> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb', 'op': 'CREATE'}, 2) >> >> Data: {'uid': 0, >> 'gfid': 'e529a399-756d-4cb1-9779-0af2822a0d94', >> 'gid': 0, >> 'mode': 33152, >> 'entry': '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb', >> 'op': 'CREATE'} >> >> and Error: 2 >> >> During creation of "main.mdb" RPC failed with error number 2, ie, ENOENT. >> This error comes when parent directory not exists or exists with different >> GFID. >> In this case Parent GFID "874799ef-df75-437b-bc8f-3fcd58b54789" does not >> exists on slave. >> >> >> To fix the issue, >> ----------------- >> Find the parent directory of "main.mdb", >> Get the GFID of that directory, using getfattr >> Check the GFID of the same directory in Slave(To confirm GFIDs are >> different) >> To fix the issue, Delete that directory in Slave. >> Set virtual xattr for that directory and all the files inside that >> directory. >> setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR> >> setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path> >> >> >> Geo-rep will recreate the directory with Proper GFID and starts sync. >> >> Let us know if you need any help. >> >> -- >> regards >> Aravinda >> >> >> >> >> On 05/25/2015 10:54 AM, Kotresh Hiremath Ravishankar wrote: >> >>> Hi Wodel, >>> >>> Is the sync mode, tar over ssh (i.e., config use_tarssh is true) ? >>> If yes, there is known issue with it and patch is already up in master. >>> >>> But it can be resolved in either of the two ways. >>> >>> 1. If sync mode required is tar over ssh, just disable sync_xattrs which >>> is true >>> by default. >>> >>> gluster vol geo-rep <master-vol> <slave-host>::<slave-vol> config >>> sync_xattrs false >>> >>> 2. If sync mode is ok to be changed to rsync. Please do. >>> gluster vol geo-rep <master-vol> <slave-host>::<slave-vol> >>> use_tarssh false >>> >>> NOTE: rsync supports syncing of acls and xattrs where as tar over ssh >>> does not. >>> In 3.7.0-2, tar over ssh should be used with sync_xattrs to false >>> >>> Hope this helps. >>> >>> Thanks and Regards, >>> Kotresh H R >>> >>> ----- Original Message ----- >>> >>>> From: "wodel youchi" <wodel.youchi at gmail.com> >>>> To: "gluster-users" <gluster-users at gluster.org> >>>> Sent: Sunday, May 24, 2015 3:31:38 AM >>>> Subject: [Gluster-users] [Centos7x64] Geo-replication problem glusterfs >>>> 3.7.0-2 >>>> >>>> Hi, >>>> >>>> I have two gluster servers in replicated mode as MASTERS >>>> and one server for replicated geo-replication. >>>> >>>> I've updated my glusterfs installation to 3.7.0-2, all three servers >>>> >>>> I've recreated my slave volumes >>>> I've started the geo-replication, it worked for a while and now I have >>>> some >>>> problmes >>>> >>>> 1- Files/directories are not deleted on slave >>>> 2- New files/rectories are not synced to the slave. 
>>>> >>>> I have these lines on the active master >>>> >>>> [2015-05-23 06:21:17.156939] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> 'e529a399-756d-4cb1-9779-0af2822a0d94', 'gid': 0, 'mode': 33152, >>>> 'entry': >>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.mdb', 'op': 'CREATE'}, >>>> 2) >>>> [2015-05-23 06:21:17.158066] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> 'b4bffa4c-2e88-4b60-9f6a-c665c4d9f7ed', 'gid': 0, 'mode': 33152, >>>> 'entry': >>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.hdb', 'op': 'CREATE'}, >>>> 2) >>>> [2015-05-23 06:21:17.159154] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> '9920cdee-6b87-4408-834b-4389f5d451fe', 'gid': 0, 'mode': 33152, >>>> 'entry': >>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.db', 'op': 'CREATE'}, >>>> 2) >>>> [2015-05-23 06:21:17.160242] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> '307756d2-d924-456f-b090-10d3ff9caccb', 'gid': 0, 'mode': 33152, >>>> 'entry': >>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/main.ndb', 'op': 'CREATE'}, >>>> 2) >>>> [2015-05-23 06:21:17.161283] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> '69ebb4cb-1157-434b-a6e9-386bea81fc1d', 'gid': 0, 'mode': 33152, >>>> 'entry': >>>> '.gfid/874799ef-df75-437b-bc8f-3fcd58b54789/COPYING', 'op': 'CREATE'}, >>>> 2) >>>> [2015-05-23 06:21:17.162368] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> '7d132fda-fc82-4ad8-8b6c-66009999650c', 'gid': 0, 'mode': 33152, >>>> 'entry': >>>> '.gfid/f6f2582e-0c5c-4cba-943a-6d5f64baf340/daily.cld', 'op': >>>> 'CREATE'}, 2) >>>> [2015-05-23 06:21:17.163718] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> 'd8a0303e-ba45-4e45-a8fd-17994c34687b', 'gid': 0, 'mode': 16832, >>>> 'entry': >>>> >>>> '.gfid/f6f2582e-0c5c-4cba-943a-6d5f64baf340/clamav-54acc14b44e696e1cfb4a75ecc395fe0', >>>> 'op': 'MKDIR'}, 2) >>>> [2015-05-23 06:21:17.165102] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> '49d42bf6-3146-42bd-bc29-e704927d6133', 'gid': 0, 'mode': 16832, >>>> 'entry': >>>> >>>> '.gfid/f6f2582e-0c5c-4cba-943a-6d5f64baf340/clamav-debec3aa6afe64bffaee8d099e76f3d4', >>>> 'op': 'MKDIR'}, 2) >>>> [2015-05-23 06:21:17.166147] W >>>> [master(/mnt/brick2/brick):792:log_failures] >>>> _GMaster: ENTRY FAILED: ({'uid': 0, 'gfid': >>>> '1ddb93ae-3717-4347-910f-607afa67cdb0', 'gid': 0, 'mode': 33152, >>>> 'entry': >>>> >>>> '.gfid/49d42bf6-3146-42bd-bc29-e704927d6133/clamav-704a1e9a3e2c97ccac127632d7c6b8e4', >>>> 'op': 'CREATE'}, 2) >>>> >>>> >>>> in the slave lot of lines like this >>>> >>>> [2015-05-22 07:53:57.071999] W [fuse-bridge.c:1970:fuse_create_cbk] >>>> 0-glusterfs-fuse: 25833: /.gfid/03a5a40b-c521-47ac-a4e3-916a6df42689 => >>>> -1 >>>> (Operation not permitted) >>>> >>>> >>>> in the active master I have 3.7 GB of XSYNC-CHANGELOG.xxxxxxx files in >>>> >>>> /var/lib/misc/glusterfsd/data2/ssh%3A%2F%2Froot%4010.10.10.10%3Agluster%3A%2F%2F127.0.0.1%3Aslavedata2/e55761a256af4acfe9b4a419be62462a/xsync >>>> >>>> I don't know if this is normal. >>>> >>>> any idea? 
>>>> >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> http://www.gluster.org/mailman/listinfo/gluster-users >>>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> http://www.gluster.org/mailman/listinfo/gluster-users >>> >> >> >
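On the question about erasing the index: the option the 3.2 troubleshooting page refers to is geo-replication.indexing, set per volume with "gluster volume set". Below is a hedged sketch of that older documented procedure, using the names visible in the logs (master volume data2, slave gserver3::slavedata2). It is the 3.2-era recipe; 3.7 may refuse to change indexing while a geo-replication session is defined, in which case the trigger-sync xattr approach from the earlier reply is the alternative.

# Stop the session, drop the index so the next start falls back to a full
# crawl/compare of the volume, then start the session again.
gluster volume geo-replication data2 gserver3::slavedata2 stop
gluster volume set data2 geo-replication.indexing off
gluster volume geo-replication data2 gserver3::slavedata2 start

# Check afterwards whether indexing was switched back on by the session:
gluster volume info data2

As the documentation quoted above warns, after such a restart all files are re-compared (checksummed), which is slow on large data sets but does not lose data.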