On March 11, 2020 10:17:05 PM GMT+02:00, "Etem Bayoğlu"
<etembayoglu at gmail.com> wrote:
>Hi Strahil,
>
>Thank you for your response. When I tail the logs on both master and
>slave I get this:
>
>On the slave, from the
>/var/log/glusterfs/geo-replication-slaves/<geo-session>/mnt-XXX.log
>file:
>
>[2020-03-11 19:53:32.721509] E
>[fuse-bridge.c:227:check_and_dump_fuse_W]
>(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f78e10488ea]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f78d83f6221]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f78d83f7998]
>(-->
>/lib64/libpthread.so.0(+0x7e65)[0x7f78dfe89e65] (-->
>/lib64/libc.so.6(clone+0x6d)[0x7f78df74f88d] ))))) 0-glusterfs-fuse:
>writing to fuse device failed: No such file or directory
>[2020-03-11 19:53:32.723758] E
>[fuse-bridge.c:227:check_and_dump_fuse_W]
>(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f78e10488ea]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f78d83f6221]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f78d83f7998]
>(-->
>/lib64/libpthread.so.0(+0x7e65)[0x7f78dfe89e65] (-->
>/lib64/libc.so.6(clone+0x6d)[0x7f78df74f88d] ))))) 0-glusterfs-fuse:
>writing to fuse device failed: No such file or directory
>
>On the master, from the
>/var/log/glusterfs/geo-replication/<geo-session>/mnt-XXX.log
>file:
>
>[2020-03-11 19:40:55.872002] E [fuse-bridge.c:4188:fuse_xattr_cbk]
>0-glusterfs-fuse: extended attribute not supported by the backend
>storage
>[2020-03-11 19:40:58.389748] E
>[fuse-bridge.c:227:check_and_dump_fuse_W]
>(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f1f4b9108ea]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f1f42cc2221]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f1f42cc3998]
>(-->
>/lib64/libpthread.so.0(+0x7e25)[0x7f1f4a751e25] (-->
>/lib64/libc.so.6(clone+0x6d)[0x7f1f4a01abad] ))))) 0-glusterfs-fuse:
>writing to fuse device failed: No such file or directory
>[2020-03-11 19:41:08.214591] E
>[fuse-bridge.c:227:check_and_dump_fuse_W]
>(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f1f4b9108ea]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f1f42cc2221]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f1f42cc3998]
>(-->
>/lib64/libpthread.so.0(+0x7e25)[0x7f1f4a751e25] (-->
>/lib64/libc.so.6(clone+0x6d)[0x7f1f4a01abad] ))))) 0-glusterfs-fuse:
>writing to fuse device failed: No such file or directory
>[2020-03-11 19:53:59.275469] E
>[fuse-bridge.c:227:check_and_dump_fuse_W]
>(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f1f4b9108ea]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f1f42cc2221]
>(-->
>/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f1f42cc3998]
>(-->
>/lib64/libpthread.so.0(+0x7e25)[0x7f1f4a751e25] (-->
>/lib64/libc.so.6(clone+0x6d)[0x7f1f4a01abad] ))))) 0-glusterfs-fuse:
>writing to fuse device failed: No such file or directory
>
>####################gsyncd.log outputs:######################
>
>From the slave:
>[2020-03-11 08:55:16.384085] I [repce(slave
>master-node/srv/media-storage):96:service_loop] RepceServer:
>terminating on
>reaching EOF.
>[2020-03-11 08:57:55.87364] I [resource(slave
>master-node/srv/media-storage):1105:connect] GLUSTER: Mounting gluster
>volume locally...
>[2020-03-11 08:57:56.171372] I [resource(slave
>master-node/srv/media-storage):1128:connect] GLUSTER: Mounted gluster
>volume duration=1.0837
>[2020-03-11 08:57:56.173346] I [resource(slave
>master-node/srv/media-storage):1155:service_loop] GLUSTER: slave
>listening
>
>From the master:
>[2020-03-11 20:08:55.145453] I [master(worker
>/srv/media-storage):1991:syncjob] Syncer: Sync Time Taken
>duration=134.9987 num_files=4661 job=2 return_code=0
>[2020-03-11 20:08:55.285871] I [master(worker
>/srv/media-storage):1421:process] _GMaster: Entry Time Taken MKD=83
>MKN=8109 LIN=0 SYM=0 REN=0 RMD=0 CRE=0 duration=17.0358 UNL=0
>[2020-03-11 20:08:55.286082] I [master(worker
>/srv/media-storage):1431:process] _GMaster: Data/Metadata Time Taken
>SETA=83 SETX=0 meta_duration=0.9334 data_duration=135.2497 DATA=8109
>XATT=0
>[2020-03-11 20:08:55.286410] I [master(worker
>/srv/media-storage):1441:process] _GMaster: Batch Completed
>changelog_end=1583917610 entry_stime=None changelog_start=1583917610
>stime=None duration=153.5185 num_changelogs=1 mode=xsync
>[2020-03-11 20:08:55.315442] I [master(worker
>/srv/media-storage):1681:crawl] _GMaster: processing xsync changelog
>path=/var/lib/misc/gluster/gsyncd/media-storage_daredevil01.zingat.com_dr-media/srv-media-storage/xsync/XSYNC-CHANGELOG.1583917613
>
>
>Thank you..
>
>On Wed, 11 Mar 2020 at 12:28, Strahil Nikolov
><hunter86_bg at yahoo.com> wrote:
>
>> On March 11, 2020 10:09:27 AM GMT+02:00, "Etem Bayoğlu" <
>> etembayoglu at gmail.com> wrote:
>> >Hello community,
>> >
>> >I've set up a glusterfs geo-replication node for disaster recovery.
>> >I manage about 10TB of media data on a gluster volume and I want to
>> >sync all of it to a remote location over WAN. So, I created a slave
>> >volume in the disaster recovery center at the remote location and
>> >started a geo-rep session. It transferred data fine up to about
>> >800GB, but syncing has stopped for three days, although gluster
>> >geo-rep status still shows Active and Hybrid Crawl. No data is being
>> >sent. I've recreated the session and restarted it, but it is still
>> >the same.
>> >
>> ># gluster volume geo-replication status
>> >
>> >MASTER NODE    MASTER VOL     MASTER BRICK          SLAVE USER    SLAVE                         SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
>> >------------------------------------------------------------------------------------------------------------------------------------------------------
>> >master-node    media-storage  /srv/media-storage    root          ssh://slave-node::dr-media    slave-node    Active    Hybrid Crawl    N/A
>> >
>> >Any idea, please? Thank you.
>>
>> Hi Etem,
>>
>> Have you checked the logs on both source and destination? Maybe they
>> can hint you at what the issue is.
>>
>> Best Regards,
>> Strahil Nikolov
>>
Hi Etem,
Nothing obvious....
I don't like this one:
[2020-03-11 19:53:32.721509] E [fuse-bridge.c:227:check_and_dump_fuse_W]
(--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f78e10488ea] (-->
/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f78d83f6221] (-->
/usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f78d83f7998] (-->
/lib64/libpthread.so.0(+0x7e65)[0x7f78dfe89e65] (-->
/lib64/libc.so.6(clone+0x6d)[0x7f78df74f88d] ))))) 0-glusterfs-fuse:
writing to fuse device failed: No such file or directory
Can you check the health of the slave volume (split-brains, brick status, etc.)?
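A rough sketch of what I mean, assuming the slave volume is 'dr-media' as in
your status output (the heal commands only apply if it is a replicated volume):
# gluster volume status dr-media
# gluster volume heal dr-media info
# gluster volume heal dr-media info split-brain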
Maybe you can check the logs to find exactly when the master stopped
replicating, and then check the slave's logs at that exact time.
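Just as a sketch, something like this should surface the last errors around
that time (adjust <geo-session> to your session directory):
# grep -F '] E [' /var/log/glusterfs/geo-replication/<geo-session>/gsyncd.log | tail -n 20
# grep -F '] E [' /var/log/glusterfs/geo-replication-slaves/<geo-session>/gsyncd.log | tail -n 20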
Also, you can increase the log level on the slave and then recreate the
geo-rep session (rough example below).
For details, check:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/configuring_the_log_level
P.S.: Trace/debug can fill up your /var/log, so enable them for a short period
of time.
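Rough example, assuming the slave volume is 'dr-media' (the geo-rep session
also has its own log-level config option; 'gluster volume geo-replication ...
config' lists the exact name):
# gluster volume set dr-media diagnostics.brick-log-level DEBUG
# gluster volume set dr-media diagnostics.client-log-level DEBUG
  ... recreate the geo-rep session and reproduce the problem ...
# gluster volume reset dr-media diagnostics.brick-log-level
# gluster volume reset dr-media diagnostics.client-log-level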
Best Regards,
Strahil Nikolov