Hi Kotresh

We have tried the above-mentioned rsync option, and we are planning to upgrade to version 6.0.

On Fri, May 31, 2019 at 11:04 AM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:

> Hi,
>
> This looks like a hang because the stderr buffer filled up with error
> messages and nothing was reading it.
> I think this issue is fixed in the latest releases. As a workaround, you
> can do the following and check if it works.
>
> Prerequisite:
> rsync version should be > 3.1.0
>
> Workaround:
> gluster volume geo-replication <MASTERVOL> <SLAVEHOST>::<SLAVEVOL> config
> rsync-options "--ignore-missing-args"
>
> Thanks,
> Kotresh HR
>
> On Thu, May 30, 2019 at 5:39 PM deepu srinivasan <sdeepugd at gmail.com> wrote:
>
>> Hi
>> We were evaluating Gluster geo-replication between two DCs, one in US
>> west and one in US east. We ran multiple trials with different file sizes.
>> Geo-replication tends to stop replicating, but when we check the status
>> it appears to be Active; the slave volume, however, does not grow in size.
>> So we restarted the geo-replication session and checked the status. The
>> status was Active, and it stayed in History Crawl for a long time. We
>> enabled DEBUG logging and checked for errors.
>> Around 2000 files appeared as syncing candidates. The rsync process
>> starts, but no data arrives on the slave volume. Every time, the rsync
>> process appears in the "ps auxxx" list, yet the replication does not
>> happen on the slave end. What would be the cause of this problem? Is
>> there any way to debug it?
>>
>> We have also checked the strace of the rsync program.
>> It displays something like this:
>>
>> "write(2, "rsync: link_stat \"/tmp/gsyncd-au"..., 128"
>>
>> We are using the below specs:
>>
>> Gluster version - 4.1.7
>> Sync mode - rsync
>> Volume - 1x3 on each end (master and slave)
>> Intranet Bandwidth - 10 Gig
>
> --
> Thanks and Regards,
> Kotresh H R
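[To illustrate the hang Kotresh describes above: a child process whose stderr pipe is never read blocks as soon as the kernel pipe buffer (64 KiB by default on Linux) fills up. A minimal shell sketch of that failure mode; this is not the actual gsyncd/rsync invocation, and the error string is made up for the demo:

# "yes" plays the part of rsync spewing errors; "sleep" plays the
# part of a parent that never reads the pipe. After roughly 64 KiB
# the writer blocks in write(2) and hangs until interrupted (Ctrl-C):
yes 'rsync: link_stat "/some/missing/path" failed' | sleep 300

This is the same mechanism as rsync writing to fd 2 with nothing draining it.]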
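[For anyone applying the workaround, the steps look roughly like the following. The names mastervol, slavehost and slavevol are placeholders standing in for <MASTERVOL>, <SLAVEHOST> and <SLAVEVOL> in your own session:

# prerequisite: rsync > 3.1.0 on every master and slave node
rsync --version | head -n 1

# set the extra rsync option for the geo-rep session
gluster volume geo-replication mastervol slavehost::slavevol \
    config rsync-options "--ignore-missing-args"

# print the session config to confirm the option was stored
gluster volume geo-replication mastervol slavehost::slavevol config]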
Hi Kotresh

The above-mentioned workaround did not work properly.

On Fri, May 31, 2019 at 3:16 PM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Hi Kotresh
> We have tried the above-mentioned rsync option, and we are planning to
> upgrade to version 6.0.
Kotresh Hiremath Ravishankar
2019-May-31 09:55 UTC
[Gluster-users] Geo Replication stops replicating
Hi,

Could you take the strace with a larger string size? The argument strings are truncated:

strace -s 500 -ttt -T -p <rsync pid>

On Fri, May 31, 2019 at 3:17 PM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Hi Kotresh
> The above-mentioned workaround did not work properly.

--
Thanks and Regards,
Kotresh H R
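[In case it helps with the capture: the pid can be picked up with pgrep, and writing the trace to a file keeps the long argument strings intact for sharing. The log path below is just an example:

# list rsync processes with their full command lines and pick the
# geo-rep worker's pid from the output
pgrep -a rsync

# attach to it: -s 500 widens truncated strings, -ttt/-T add
# timestamps and per-syscall durations, -o saves the trace to a file
strace -s 500 -ttt -T -p <rsync pid> -o /tmp/rsync-strace.log]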