Hi Kotresh, Sunny

Found this log in the slave machine:

> [2019-06-05 08:49:10.632583] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
> The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-06-05 08:49:10.632583] and [2019-06-05 08:49:10.670863]
> The message "I [MSGID: 106496] [glusterd-handler.c:3187:__glusterd_handle_mount] 0-glusterd: Received mount req" repeated 34 times between [2019-06-05 08:48:41.005398] and [2019-06-05 08:50:37.254063]
> The message "E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file" repeated 34 times between [2019-06-05 08:48:41.005434] and [2019-06-05 08:50:37.254079]
> The message "W [MSGID: 106176] [glusterd-mountbroker.c:719:glusterd_do_mount] 0-management: unsuccessful mount request [No such file or directory]" repeated 34 times between [2019-06-05 08:48:41.005444] and [2019-06-05 08:50:37.254080]
> [2019-06-05 08:50:46.361347] I [MSGID: 106496] [glusterd-handler.c:3187:__glusterd_handle_mount] 0-glusterd: Received mount req
> [2019-06-05 08:50:46.361384] E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file
> [2019-06-05 08:50:46.361419] W [MSGID: 106176] [glusterd-mountbroker.c:719:glusterd_do_mount] 0-management: unsuccessful mount request [No such file or directory]
> The message "I [MSGID: 106496] [glusterd-handler.c:3187:__glusterd_handle_mount] 0-glusterd: Received mount req" repeated 33 times between [2019-06-05 08:50:46.361347] and [2019-06-05 08:52:34.019741]
> The message "E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file" repeated 33 times between [2019-06-05 08:50:46.361384] and [2019-06-05 08:52:34.019757]
> The message "W [MSGID: 106176] [glusterd-mountbroker.c:719:glusterd_do_mount] 0-management: unsuccessful mount request [No such file or directory]" repeated 33 times between [2019-06-05 08:50:46.361419] and [2019-06-05 08:52:34.019758]
> [2019-06-05 08:52:44.426839] I [MSGID: 106496] [glusterd-handler.c:3187:__glusterd_handle_mount] 0-glusterd: Received mount req
> [2019-06-05 08:52:44.426886] E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file
> [2019-06-05 08:52:44.426896] W [MSGID: 106176] [glusterd-mountbroker.c:719:glusterd_do_mount] 0-management: unsuccessful mount request [No such file or directory]

On Wed, Jun 5, 2019 at 1:06 AM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Thank you Kotresh
>
> On Tue, Jun 4, 2019, 11:20 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>
>> Ccing Sunny, who was investigating a similar issue.
>>
>> On Tue, Jun 4, 2019 at 5:46 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>
>>> Have already added the path in bashrc. Still in faulty state.
>>>
>>> On Tue, Jun 4, 2019, 5:27 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>
>>>> Could you please try adding /usr/sbin to $PATH for user 'sas'? If it's bash, add 'export PATH=/usr/sbin:$PATH' in /home/sas/.bashrc
>>>>
>>>> On Tue, Jun 4, 2019 at 5:24 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>
>>>>> Hi Kotresh
>>>>> Please find the logs of the above error.
>>>>>
>>>>> *Master log snippet*
>>>>>
>>>>>> [2019-06-04 11:52:09.254731] I [resource(worker /home/sas/gluster/data/code-misc):1379:connect_remote] SSH: Initializing SSH connection between master and slave...
>>>>>> [2019-06-04 11:52:09.308923] D [repce(worker /home/sas/gluster/data/code-misc):196:push] RepceClient: call 89724:139652759443264:1559649129.31 __repce_version__() ...
>>>>>> [2019-06-04 11:52:09.602792] E [syncdutils(worker /home/sas/gluster/data/code-misc):311:log_raise_exception] <top>: connection to peer is broken
>>>>>> [2019-06-04 11:52:09.603312] E [syncdutils(worker /home/sas/gluster/data/code-misc):805:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-4aL2tc/d893f66e0addc32f7d0080bb503f5185.sock sas at 192.168.185.107 /usr/libexec/glusterfs/gsyncd slave code-misc sas@192.168.185.107::code-misc --master-node 192.168.185.106 --master-node-id 851b64d0-d885-4ae9-9b38-ab5b15db0fec --master-brick /home/sas/gluster/data/code-misc --local-node 192.168.185.122 --local-node-id bcaa7af6-c3a1-4411-8e99-4ebecb32eb6a --slave-timeout 120 --slave-log-level DEBUG --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin error=1
>>>>>> [2019-06-04 11:52:09.614996] I [repce(agent /home/sas/gluster/data/code-misc):97:service_loop] RepceServer: terminating on reaching EOF.
>>>>>> [2019-06-04 11:52:09.615545] D [monitor(monitor):271:monitor] Monitor: worker(/home/sas/gluster/data/code-misc) connected
>>>>>> [2019-06-04 11:52:09.616528] I [monitor(monitor):278:monitor] Monitor: worker died in startup phase brick=/home/sas/gluster/data/code-misc
>>>>>> [2019-06-04 11:52:09.619391] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
>>>>>
>>>>> *Slave log snippet*
>>>>>
>>>>>> [2019-06-04 11:50:09.782668] E [syncdutils(slave 192.168.185.106/home/sas/gluster/data/code-misc):809:logerr] Popen: /usr/sbin/gluster> 2 : failed with this errno (No such file or directory)
>>>>>> [2019-06-04 11:50:11.188167] W [gsyncd(slave 192.168.185.125/home/sas/gluster/data/code-misc):305:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/code-misc_192.168.185.107_code-misc/gsyncd.conf
>>>>>> [2019-06-04 11:50:11.201070] I [resource(slave 192.168.185.125/home/sas/gluster/data/code-misc):1098:connect] GLUSTER: Mounting gluster volume locally...
>>>>>> [2019-06-04 11:50:11.271231] E [resource(slave 192.168.185.125/home/sas/gluster/data/code-misc):1006:handle_mounter] MountbrokerMounter: glusterd answered mnt
>>>>>> [2019-06-04 11:50:11.271998] E [syncdutils(slave 192.168.185.125/home/sas/gluster/data/code-misc):805:errlog] Popen: command returned error cmd=/usr/sbin/gluster --remote-host=localhost system:: mount sas user-map-root=sas aux-gfid-mount acl log-level=INFO log-file=/var/log/glusterfs/geo-replication-slaves/code-misc_192.168.185.107_code-misc/mnt-192.168.185.125-home-sas-gluster-data-code-misc.log volfile-server=localhost volfile-id=code-misc client-pid=-1 error=1
>>>>>> [2019-06-04 11:50:11.272113] E [syncdutils(slave 192.168.185.125/home/sas/gluster/data/code-misc):809:logerr] Popen: /usr/sbin/gluster> 2 : failed with this errno (No such file or directory)
>>>>>
>>>>> On Tue, Jun 4, 2019 at 5:10 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>
>>>>>> Hi
>>>>>> As discussed, I have upgraded gluster from version 4.1 to 6.2, but geo-replication failed to start. It stays in the faulty state.
>>>>>>
>>>>>> On Fri, May 31, 2019, 5:32 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>
>>>>>>> Checked the data. It remains at 2708. No progress.
>>>>>>>
>>>>>>> On Fri, May 31, 2019 at 4:36 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>>>>
>>>>>>>> That means it could be working and the defunct process might be some old zombie one. Could you check whether the data progresses?
>>>>>>>>
>>>>>>>> On Fri, May 31, 2019 at 4:29 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>> When I change the rsync option, the rsync process doesn't seem to start; only a defunct process is listed in ps aux. Only when I set the rsync option to " " and restart all the processes is the rsync process listed in ps aux.
>>>>>>>>>
>>>>>>>>> On Fri, May 31, 2019 at 4:23 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, the rsync config option should have fixed this issue.
>>>>>>>>>>
>>>>>>>>>> Could you share the output of the following?
>>>>>>>>>>
>>>>>>>>>> 1. gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options
>>>>>>>>>> 2. ps -ef | grep rsync
>>>>>>>>>>
>>>>>>>>>> On Fri, May 31, 2019 at 4:11 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Done. We got the following result:
>>>>>>>>>>>
>>>>>>>>>>>> 1559298781.338234 write(2, "rsync: link_stat \"/tmp/gsyncd-aux-mount-EEJ_sY/.gfid/3fa6aed8-802e-4efe-9903-8bc171176d88\" failed: No such file or directory (2)", 128
>>>>>>>>>>>
>>>>>>>>>>> Seems like a file is missing?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 31, 2019 at 3:25 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Could you take the strace with a larger string size? The argument strings are truncated.
>>>>>>>>>>>>
>>>>>>>>>>>> strace -s 500 -ttt -T -p <rsync pid>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 31, 2019 at 3:17 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Kotresh
>>>>>>>>>>>>> The above-mentioned workaround did not work properly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, May 31, 2019 at 3:16 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Kotresh
>>>>>>>>>>>>>> We have tried the above-mentioned rsync option and we are planning to upgrade to version 6.0.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, May 31, 2019 at 11:04 AM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This looks like a hang because the stderr buffer filled up with error messages and nothing was reading it. I think this issue is fixed in the latest releases. As a workaround, you can do the following and check whether it works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Prerequisite: the rsync version should be > 3.1.0
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Workaround:
>>>>>>>>>>>>>>> gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options "--ignore-missing-args"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Kotresh HR
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, May 30, 2019 at 5:39 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>>>> We were evaluating Gluster geo-replication between two DCs, one in US west and one in US east, and ran multiple trials with different file sizes.
>>>>>>>>>>>>>>>> Geo-replication tends to stop replicating, but while checking the status it appears to be in the Active state. The slave volume, however, did not increase in size.
>>>>>>>>>>>>>>>> So we restarted the geo-replication session and checked the status. The status was Active and it stayed in History Crawl for a long time. We enabled DEBUG mode in logging and checked for any error.
>>>>>>>>>>>>>>>> Around 2000 files appeared as syncing candidates. The rsync process starts, but the sync does not happen on the slave volume. Every time, the rsync process appears in the "ps auxxx" list, yet the replication does not happen on the slave end. What would be the cause of this problem? Is there any way to debug it?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We have also checked the strace of the rsync program; it displays something like this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "write(2, "rsync: link_stat \"/tmp/gsyncd-au"..., 128"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We are using the below specs:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Gluster version - 4.1.7
>>>>>>>>>>>>>>>> Sync mode - rsync
>>>>>>>>>>>>>>>> Volume - 1x3 at each end (master and slave)
>>>>>>>>>>>>>>>> Intranet bandwidth - 10 Gig
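For reference, the workaround suggested in the quoted thread comes down to the following (a sketch; user 'sas' and the /usr/sbin location are taken from the thread, <MASTERVOL>, <SLAVEHOST> and <SLAVEVOL> are placeholders, and on some releases the config key is spelled gluster_command_dir rather than gluster-command-dir):

  # On the slave node, for the unprivileged geo-rep user 'sas', make the
  # gluster CLI in /usr/sbin resolvable from shells spawned over SSH:
  echo 'export PATH=/usr/sbin:$PATH' >> /home/sas/.bashrc

  # Alternatively, point gsyncd at the CLI directory via the session config,
  # run from a master node:
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config gluster-command-dir /usr/sbin
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config slave-gluster-command-dir /usr/sbin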
Hi Kotresh, Sunny

I have mailed the logs I found in one of the slave machines. Is there anything to do with permissions? Please help.

On Wed, Jun 5, 2019 at 2:28 PM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Hi Kotresh, Sunny
> Found this log in the slave machine.
>
> [snip: the slave log and the earlier thread, quoted in full above]
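Putting the rsync workaround from the quoted thread in one place (a sketch; <MASTERVOL>, <SLAVEHOST> and <SLAVEVOL> are placeholders, and --ignore-missing-args only exists in rsync releases newer than 3.1.0):

  rsync --version | head -n 1    # must be > 3.1.0 on both master and slave

  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options "--ignore-missing-args"
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> stop
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> start

  # Verify the option took effect and that the worker's rsync is alive:
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options
  ps -ef | grep '[r]sync'        # should show a running rsync, not <defunct>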
Kotresh Hiremath Ravishankar
2019-Jun-06 04:58 UTC
[Gluster-users] Geo Replication stops replicating
Hi,

I think the steps to set up non-root geo-rep were not followed properly. A required entry is missing from the glusterd vol file, as this log message shows:

The message "E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file" repeated 33 times between [2019-06-05 08:50:46.361384] and [2019-06-05 08:52:34.019757]

Could you please follow the steps from the guide below?

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html-single/administration_guide/index#Setting_Up_the_Environment_for_a_Secure_Geo-replication_Slave

And let us know if you still face the issue.

On Thu, Jun 6, 2019 at 10:24 AM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Hi Kotresh, Sunny
> I have mailed the logs I found in one of the slave machines. Is there anything to do with permissions? Please help.
>
> [snip: the slave log and the earlier thread, quoted in full above]
-- 
Thanks and Regards,
Kotresh H R
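For completeness, the mountbroker setup that the linked guide walks through looks roughly like this on the slave nodes (a sketch; volume 'code-misc' and user 'sas' are taken from this thread, while the broker root /var/mountbroker-root and group 'geogroup' follow the documentation's examples):

  # The glusterfs-geo-replication package ships a helper that writes the
  # required options into the glusterd vol file:
  gluster-mountbroker setup /var/mountbroker-root geogroup
  gluster-mountbroker add code-misc sas

  # glusterd on the slave nodes must be restarted to pick up the change:
  systemctl restart glusterd

  # Afterwards /etc/glusterfs/glusterd.vol should contain entries like:
  #   option mountbroker-root /var/mountbroker-root
  #   option mountbroker-geo-replication.sas code-misc
  #   option geo-replication-log-group geogroup
  #   option rpc-auth-allow-insecure on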