Kotresh Hiremath Ravishankar
2019-May-31 10:52 UTC
[Gluster-users] Geo Replication stops replicating
Yes, the rsync config option should have fixed this issue. Could you share the output of the following?

1. gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options
2. ps -ef | grep rsync

On Fri, May 31, 2019 at 4:11 PM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Done. We got the following result:
>
>   1559298781.338234 write(2, "rsync: link_stat \"/tmp/gsyncd-aux-mount-EEJ_sY/.gfid/3fa6aed8-802e-4efe-9903-8bc171176d88\" failed: No such file or directory (2)", 128
>
> Seems like a file is missing?
>
> On Fri, May 31, 2019 at 3:25 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>
>> Hi,
>>
>> Could you take the strace with a larger string size? The argument strings are truncated.
>>
>>   strace -s 500 -ttt -T -p <rsync pid>
>>
>> On Fri, May 31, 2019 at 3:17 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>
>>> Hi Kotresh,
>>> The above-mentioned workaround did not work properly.
>>>
>>> On Fri, May 31, 2019 at 3:16 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>
>>>> Hi Kotresh,
>>>> We have tried the above-mentioned rsync option and we are planning to upgrade to version 6.0.
>>>>
>>>> On Fri, May 31, 2019 at 11:04 AM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This looks like a hang because the stderr buffer filled up with error messages and nothing was reading it. I think this issue is fixed in the latest releases. As a workaround, you can do the following and check whether it works.
>>>>>
>>>>> Prerequisite: rsync version should be > 3.1.0
>>>>>
>>>>> Workaround:
>>>>>   gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options "--ignore-missing-args"
>>>>>
>>>>> Thanks,
>>>>> Kotresh HR
>>>>>
>>>>> On Thu, May 30, 2019 at 5:39 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> We were evaluating Gluster geo-replication between two DCs, one in US West and one in US East, and ran multiple trials with different file sizes.
>>>>>> Geo-replication tends to stop replicating, but the status still shows Active, and the slave volume does not grow in size. So we restarted the geo-replication session and checked the status again: it stayed Active and remained in History Crawl for a long time. We enabled DEBUG logging and checked for errors. Around 2000 files appeared as sync candidates. The rsync process starts, but nothing is synced to the slave volume. The rsync process shows up in the "ps auxxx" listing every time, yet no replication happens on the slave end. What could be the cause of this problem? Is there any way to debug it?
>>>>>>
>>>>>> We have also checked the strace of the rsync program; it displays something like this:
>>>>>>
>>>>>>   write(2, "rsync: link_stat \"/tmp/gsyncd-au"..., 128
>>>>>>
>>>>>> We are using the specs below:
>>>>>>
>>>>>> Gluster version - 4.1.7
>>>>>> Sync mode - rsync
>>>>>> Volume - 1x3 at each end (master and slave)
>>>>>> Intranet bandwidth - 10 Gig

--
Thanks and Regards,
Kotresh H R
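For reference, a minimal sketch of the checks being requested here, using the same <MASTERVOL>/<SLAVEHOST>/<SLAVEVOL> placeholders as above (substitute the real session names); an rsync version check is included because the --ignore-missing-args workaround assumes rsync newer than 3.1.0:

    # Show the rsync options currently configured for the geo-rep session
    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options

    # List the rsync workers spawned by gsyncd (running or defunct)
    ps -ef | grep rsync

    # Confirm the rsync version on both master and slave nodes
    rsync --version | head -1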
deepu srinivasan
2019-May-31
[Gluster-users] Geo Replication stops replicating

Hi,
When I change the rsync option, the rsync process doesn't seem to start; only a defunct process is listed in ps aux. Only when I set the rsync option back to " " and restart all the processes does the rsync process appear in ps aux.

On Fri, May 31, 2019 at 4:23 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:

> Yes, the rsync config option should have fixed this issue.
>
> Could you share the output of the following?
>
> 1. gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options
> 2. ps -ef | grep rsync
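If the session keeps leaving only defunct rsync processes behind after the config change, one possible sequence (a sketch, assuming the session can be stopped briefly) is to stop the session, set the option, and start it again so that freshly started workers pick up the new rsync-options:

    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> stop
    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config rsync-options "--ignore-missing-args"
    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> start

    # Then verify the workers and the spawned rsync processes
    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> status
    ps -ef | grep rsync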
Kotresh Hiremath Ravishankar
2019-May-31 11:05 UTC
[Gluster-users] Geo Replication stops replicating
That means it could be working, and the defunct process might be an old zombie. Could you check whether the data is progressing?

On Fri, May 31, 2019 at 4:29 PM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Hi,
> When I change the rsync option, the rsync process doesn't seem to start; only a defunct process is listed in ps aux. Only when I set the rsync option back to " " and restart all the processes does the rsync process appear in ps aux.

--
Thanks and Regards,
Kotresh H R
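One way to confirm whether data is actually progressing (a sketch; the exact status columns vary between Gluster releases, and /mnt/slave is just an example mount point):

    # Per-worker counters such as files synced and entries pending
    gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> status detail

    # On the slave side, mount the slave volume and watch its usage grow
    mount -t glusterfs <SLAVEHOST>:/<SLAVEVOL> /mnt/slave
    df -h /mnt/slave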