Hi Kotresh, Sunny

Found this log in the slave machine:

> [2019-06-05 08:49:10.632583] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
> The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-06-05 08:49:10.632583] and [2019-06-05 08:49:10.670863]
> The message "I [MSGID: 106496] [glusterd-handler.c:3187:__glusterd_handle_mount] 0-glusterd: Received mount req" repeated 34 times between [2019-06-05 08:48:41.005398] and [2019-06-05 08:50:37.254063]
> The message "E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file" repeated 34 times between [2019-06-05 08:48:41.005434] and [2019-06-05 08:50:37.254079]
> The message "W [MSGID: 106176] [glusterd-mountbroker.c:719:glusterd_do_mount] 0-management: unsuccessful mount request [No such file or directory]" repeated 34 times between [2019-06-05 08:48:41.005444] and [2019-06-05 08:50:37.254080]
> [2019-06-05 08:50:46.361347] I [MSGID: 106496] [glusterd-handler.c:3187:__glusterd_handle_mount] 0-glusterd: Received mount req
> [2019-06-05 08:50:46.361384] E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file
> [2019-06-05 08:50:46.361419] W [MSGID: 106176] [glusterd-mountbroker.c:719:glusterd_do_mount] 0-management: unsuccessful mount request [No such file or directory]
> The message "I [MSGID: 106496] [glusterd-handler.c:3187:__glusterd_handle_mount] 0-glusterd: Received mount req" repeated 33 times between [2019-06-05 08:50:46.361347] and [2019-06-05 08:52:34.019741]
> The message "E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file" repeated 33 times between [2019-06-05 08:50:46.361384] and [2019-06-05 08:52:34.019757]
> The message "W [MSGID: 106176] [glusterd-mountbroker.c:719:glusterd_do_mount] 0-management: unsuccessful mount request [No such file or directory]" repeated 33 times between [2019-06-05 08:50:46.361419] and [2019-06-05 08:52:34.019758]
> [2019-06-05 08:52:44.426839] I [MSGID: 106496] [glusterd-handler.c:3187:__glusterd_handle_mount] 0-glusterd: Received mount req
> [2019-06-05 08:52:44.426886] E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file
> [2019-06-05 08:52:44.426896] W [MSGID: 106176] [glusterd-mountbroker.c:719:glusterd_do_mount] 0-management: unsuccessful mount request [No such file or directory]

On Wed, Jun 5, 2019 at 1:06 AM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Thank you Kotresh
>
> On Tue, Jun 4, 2019, 11:20 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>
>> Ccing Sunny, who was investigating a similar issue.
>>
>> On Tue, Jun 4, 2019 at 5:46 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>
>>> Have already added the path in bashrc. Still in faulty state.
>>>
>>> On Tue, Jun 4, 2019, 5:27 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>
>>>> Could you please try adding /usr/sbin to $PATH for user 'sas'? If it's bash, add 'export PATH=/usr/sbin:$PATH' in /home/sas/.bashrc
>>>>
>>>> On Tue, Jun 4, 2019 at 5:24 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>
>>>>> Hi Kotresh
>>>>> Please find the logs of the above error.
>>>>>
>>>>> *Master log snippet*
>>>>>
>>>>>> [2019-06-04 11:52:09.254731] I [resource(worker /home/sas/gluster/data/code-misc):1379:connect_remote] SSH: Initializing SSH connection between master and slave...
>>>>>> [2019-06-04 11:52:09.308923] D [repce(worker /home/sas/gluster/data/code-misc):196:push] RepceClient: call 89724:139652759443264:1559649129.31 __repce_version__() ...
>>>>>> [2019-06-04 11:52:09.602792] E [syncdutils(worker /home/sas/gluster/data/code-misc):311:log_raise_exception] <top>: connection to peer is broken
>>>>>> [2019-06-04 11:52:09.603312] E [syncdutils(worker /home/sas/gluster/data/code-misc):805:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-4aL2tc/d893f66e0addc32f7d0080bb503f5185.sock sas at 192.168.185.107 /usr/libexec/glusterfs/gsyncd slave code-misc sas@192.168.185.107::code-misc --master-node 192.168.185.106 --master-node-id 851b64d0-d885-4ae9-9b38-ab5b15db0fec --master-brick /home/sas/gluster/data/code-misc --local-node 192.168.185.122 --local-node-id bcaa7af6-c3a1-4411-8e99-4ebecb32eb6a --slave-timeout 120 --slave-log-level DEBUG --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin error=1
>>>>>> [2019-06-04 11:52:09.614996] I [repce(agent /home/sas/gluster/data/code-misc):97:service_loop] RepceServer: terminating on reaching EOF.
>>>>>> [2019-06-04 11:52:09.615545] D [monitor(monitor):271:monitor] Monitor: worker(/home/sas/gluster/data/code-misc) connected
>>>>>> [2019-06-04 11:52:09.616528] I [monitor(monitor):278:monitor] Monitor: worker died in startup phase brick=/home/sas/gluster/data/code-misc
>>>>>> [2019-06-04 11:52:09.619391] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
>>>>>
>>>>> *Slave log snippet*
>>>>>
>>>>>> [2019-06-04 11:50:09.782668] E [syncdutils(slave 192.168.185.106/home/sas/gluster/data/code-misc):809:logerr] Popen: /usr/sbin/gluster> 2 : failed with this errno (No such file or directory)
>>>>>> [2019-06-04 11:50:11.188167] W [gsyncd(slave 192.168.185.125/home/sas/gluster/data/code-misc):305:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/code-misc_192.168.185.107_code-misc/gsyncd.conf
>>>>>> [2019-06-04 11:50:11.201070] I [resource(slave 192.168.185.125/home/sas/gluster/data/code-misc):1098:connect] GLUSTER: Mounting gluster volume locally...
>>>>>> [2019-06-04 11:50:11.271231] E [resource(slave 192.168.185.125/home/sas/gluster/data/code-misc):1006:handle_mounter] MountbrokerMounter: glusterd answered mnt
>>>>>> [2019-06-04 11:50:11.271998] E [syncdutils(slave 192.168.185.125/home/sas/gluster/data/code-misc):805:errlog] Popen: command returned error cmd=/usr/sbin/gluster --remote-host=localhost system:: mount sas user-map-root=sas aux-gfid-mount acl log-level=INFO log-file=/var/log/glusterfs/geo-replication-slaves/code-misc_192.168.185.107_code-misc/mnt-192.168.185.125-home-sas-gluster-data-code-misc.log volfile-server=localhost volfile-id=code-misc client-pid=-1 error=1
>>>>>> [2019-06-04 11:50:11.272113] E [syncdutils(slave 192.168.185.125/home/sas/gluster/data/code-misc):809:logerr] Popen: /usr/sbin/gluster> 2 : failed with this errno (No such file or directory)
>>>>>
>>>>> On Tue, Jun 4, 2019 at 5:10 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>
>>>>>> Hi
>>>>>> As discussed, I have upgraded gluster from version 4.1 to 6.2, but geo-replication failed to start. It stays in the faulty state.
>>>>>>
>>>>>> On Fri, May 31, 2019, 5:32 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>
>>>>>>> Checked the data. It remains at 2708. No progress.
>>>>>>>
>>>>>>> On Fri, May 31, 2019 at 4:36 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>>>>
>>>>>>>> That means it could be working and the defunct process might be some old zombie one. Could you check whether the data progresses?
>>>>>>>>
>>>>>>>> On Fri, May 31, 2019 at 4:29 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>> When I change the rsync option, the rsync process doesn't seem to start; only a defunct process is listed in ps aux. Only when I set the rsync option to " " and restart all the processes is the rsync process listed in ps aux.
>>>>>>>>>
>>>>>>>>> On Fri, May 31, 2019 at 4:23 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, the rsync config option should have fixed this issue.
>>>>>>>>>>
>>>>>>>>>> Could you share the output of the following?
>>>>>>>>>>
>>>>>>>>>> 1. gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options
>>>>>>>>>> 2. ps -ef | grep rsync
>>>>>>>>>>
>>>>>>>>>> On Fri, May 31, 2019 at 4:11 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Done. We got the following result:
>>>>>>>>>>>
>>>>>>>>>>>> 1559298781.338234 write(2, "rsync: link_stat \"/tmp/gsyncd-aux-mount-EEJ_sY/.gfid/3fa6aed8-802e-4efe-9903-8bc171176d88\" failed: No such file or directory (2)", 128
>>>>>>>>>>>
>>>>>>>>>>> Seems like a file is missing?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 31, 2019 at 3:25 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Could you take the strace with a larger string size? The argument strings are truncated.
>>>>>>>>>>>>
>>>>>>>>>>>> strace -s 500 -ttt -T -p <rsync pid>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 31, 2019 at 3:17 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Kotresh
>>>>>>>>>>>>> The above-mentioned workaround did not work properly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, May 31, 2019 at 3:16 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Kotresh
>>>>>>>>>>>>>> We have tried the above-mentioned rsync option and we are planning to upgrade to version 6.0.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, May 31, 2019 at 11:04 AM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This looks like a hang because the stderr buffer filled up with error messages and nothing was reading it. I think this issue is fixed in the latest releases. As a workaround, you can do the following and check whether it works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Prerequisite: the rsync version should be > 3.1.0
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Workaround:
>>>>>>>>>>>>>>> gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options "--ignore-missing-args"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Kotresh HR
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, May 30, 2019 at 5:39 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>>>> We were evaluating Gluster geo-replication between two DCs, one in US west and one in US east, and ran multiple trials with different file sizes.
>>>>>>>>>>>>>>>> Geo-replication tends to stop replicating, but while checking the status it appears to be in the Active state. The slave volume, however, did not increase in size.
>>>>>>>>>>>>>>>> So we restarted the geo-replication session and checked the status. The status was Active and it stayed in History Crawl for a long time. We enabled DEBUG mode in logging and checked for any error.
>>>>>>>>>>>>>>>> Around 2000 files appeared as syncing candidates. The rsync process starts, but the sync does not happen on the slave volume. Every time, the rsync process appears in the "ps auxxx" list, yet the replication does not happen on the slave end. What would be the cause of this problem? Is there any way to debug it?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We have also checked the strace of the rsync program; it displays something like this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "write(2, "rsync: link_stat \"/tmp/gsyncd-au"..., 128"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We are using the below specs:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Gluster version - 4.1.7
>>>>>>>>>>>>>>>> Sync mode - rsync
>>>>>>>>>>>>>>>> Volume - 1x3 at each end (master and slave)
>>>>>>>>>>>>>>>> Intranet bandwidth - 10 Gig
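For reference, the workaround suggested in the quoted thread comes down to the following (a sketch; user 'sas' and the /usr/sbin location are taken from the thread, <MASTERVOL>, <SLAVEHOST> and <SLAVEVOL> are placeholders, and on some releases the config key is spelled gluster_command_dir rather than gluster-command-dir):

  # On the slave node, for the unprivileged geo-rep user 'sas', make the
  # gluster CLI in /usr/sbin resolvable from shells spawned over SSH:
  echo 'export PATH=/usr/sbin:$PATH' >> /home/sas/.bashrc

  # Alternatively, point gsyncd at the CLI directory via the session config,
  # run from a master node:
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config gluster-command-dir /usr/sbin
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config slave-gluster-command-dir /usr/sbin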
Hi Kotresh, Sunny

I have mailed the logs I found in one of the slave machines. Is there anything to do with permissions? Please help.

On Wed, Jun 5, 2019 at 2:28 PM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Hi Kotresh, Sunny
> Found this log in the slave machine.
>
> [snip: the slave log and the earlier thread, quoted in full above]
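Putting the rsync workaround from the quoted thread in one place (a sketch; <MASTERVOL>, <SLAVEHOST> and <SLAVEVOL> are placeholders, and --ignore-missing-args only exists in rsync releases newer than 3.1.0):

  rsync --version | head -n 1    # must be > 3.1.0 on both master and slave

  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options "--ignore-missing-args"
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> stop
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> start

  # Verify the option took effect and that the worker's rsync is alive:
  gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options
  ps -ef | grep '[r]sync'        # should show a running rsync, not <defunct>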
Kotresh Hiremath Ravishankar
2019-Jun-06 04:58 UTC
[Gluster-users] Geo Replication stops replicating
Hi,

I think the steps to set up non-root geo-rep were not followed properly. A required entry is missing from the glusterd vol file, as this log message shows:

The message "E [MSGID: 106061] [glusterd-mountbroker.c:555:glusterd_do_mount] 0-management: 'option mountbroker-root' missing in glusterd vol file" repeated 33 times between [2019-06-05 08:50:46.361384] and [2019-06-05 08:52:34.019757]

Could you please follow the steps from the guide below?

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html-single/administration_guide/index#Setting_Up_the_Environment_for_a_Secure_Geo-replication_Slave

And let us know if you still face the issue.

On Thu, Jun 6, 2019 at 10:24 AM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Hi Kotresh, Sunny
> I have mailed the logs I found in one of the slave machines. Is there anything to do with permissions? Please help.
>
> [snip: the slave log and the earlier thread, quoted in full above]
-- 
Thanks and Regards,
Kotresh H R
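For completeness, the mountbroker setup that the linked guide walks through looks roughly like this on the slave nodes (a sketch; volume 'code-misc' and user 'sas' are taken from this thread, while the broker root /var/mountbroker-root and group 'geogroup' follow the documentation's examples):

  # The glusterfs-geo-replication package ships a helper that writes the
  # required options into the glusterd vol file:
  gluster-mountbroker setup /var/mountbroker-root geogroup
  gluster-mountbroker add code-misc sas

  # glusterd on the slave nodes must be restarted to pick up the change:
  systemctl restart glusterd

  # Afterwards /etc/glusterfs/glusterd.vol should contain entries like:
  #   option mountbroker-root /var/mountbroker-root
  #   option mountbroker-geo-replication.sas code-misc
  #   option geo-replication-log-group geogroup
  #   option rpc-auth-allow-insecure on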