Andy Coates
2019-Mar-20 06:21 UTC
[Gluster-users] [geo-rep] Replication faulty - gsyncd.log OSError: [Errno 13] Permission denied
We're seeing the same permission denied errors when running as a non-root geosync user. Does anyone know what the underlying issue is? On Wed, 26 Sep 2018 at 00:40, Kotte, Christian (Ext) < christian.kotte at novartis.com> wrote:> I changed the replication to use the root user and re-created the > replication with 'create force'. Now the files and folders were replicated, > and the permission denied and 'New folder' errors disappeared, but old files > are not deleted. > > > > Looks like the history crawl is in some kind of a loop: > > > > [root at master ~]# gluster volume geo-replication status > > > > MASTER NODE MASTER VOL MASTER BRICK > SLAVE USER SLAVE SLAVE > NODE STATUS CRAWL STATUS LAST_SYNCED > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > > master glustervol1 /bricks/brick1/brick > root ssh://slave_1::glustervol1 > slave_1 Active Hybrid Crawl N/A > > master glustervol1 /bricks/brick1/brick > root ssh://slave_2::glustervol1 > slave_2 Active Hybrid Crawl N/A > > master glustervol1 /bricks/brick1/brick > root ssh://slave_3::glustervol1 > slave_3 Active Hybrid Crawl N/A > > > > tail -f > /var/log/glusterfs/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.log > > File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line > 104, in cl_history_changelog > > raise ChangelogHistoryNotAvailable() > > ChangelogHistoryNotAvailable > > [2018-09-25 14:10:44.196011] E [repce(worker > /bricks/brick1/brick):197:__call__] RepceClient: call failed > call=29945:139700517484352:1537884644.19 method=history > error=ChangelogHistoryNotAvailable > > [2018-09-25 14:10:44.196405] I [resource(worker > /bricks/brick1/brick):1295:service_loop] GLUSTER: Changelog history not > available, using xsync > > [2018-09-25 14:10:44.221385] I [master(worker > /bricks/brick1/brick):1623:crawl] _GMaster: starting hybrid crawl > stime=(0, 0) > > [2018-09-25 14:10:44.223382] I [gsyncdstatus(worker > /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl > Status Change status=Hybrid Crawl > > [2018-09-25 14:10:46.225296] I [master(worker > /bricks/brick1/brick):1634:crawl] _GMaster: processing xsync changelog > path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick/xsync/XSYNC-CHANGELOG.1537884644 > > [2018-09-25 14:13:36.157408] I [gsyncd(config-get):297:main] <top>: Using > session config file > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-25 14:13:36.377880] I [gsyncd(status):297:main] <top>: Using > session config file > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-25 14:31:10.145035] I [master(worker > /bricks/brick1/brick):1944:syncjob] Syncer: Sync Time Taken > duration=1212.5316 num_files=1474 job=2 return_code=11 > > [2018-09-25 14:31:10.152637] E [syncdutils(worker > /bricks/brick1/brick):801:errlog] Popen: command returned error cmd=rsync > -aR0 --inplace --files-from=- --super --stats --numeric-ids > --no-implied-dirs --xattrs --acls . 
-e ssh -oPasswordAuthentication=no > -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem > -p 22 -oControlMaster=auto -S > /tmp/gsyncd-aux-ssh-gg758Z/caec4d1d03cc28bc1853f692e291164f.sock > slave_3:/proc/15919/cwd error=11 > > [2018-09-25 14:31:10.237371] I [repce(agent > /bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching > EOF. > > [2018-09-25 14:31:10.430820] I > [gsyncdstatus(monitor):244:set_worker_status] GeorepStatus: Worker Status > Change status=Faulty > > [2018-09-25 14:31:20.541475] I [monitor(monitor):158:monitor] Monitor: > starting gsyncd worker brick=/bricks/brick1/brick slave_node=slave_3 > > [2018-09-25 14:31:20.806518] I [gsyncd(agent > /bricks/brick1/brick):297:main] <top>: Using session config file > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-25 14:31:20.816536] I [changelogagent(agent > /bricks/brick1/brick):72:__init__] ChangelogAgent: Agent listining... > > [2018-09-25 14:31:20.821574] I [gsyncd(worker > /bricks/brick1/brick):297:main] <top>: Using session config file > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-25 14:31:20.882128] I [resource(worker > /bricks/brick1/brick):1377:connect_remote] SSH: Initializing SSH connection > between master and slave... > > [2018-09-25 14:31:24.169857] I [resource(worker > /bricks/brick1/brick):1424:connect_remote] SSH: SSH connection between > master and slave established. duration=3.2873 > > [2018-09-25 14:31:24.170401] I [resource(worker > /bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume > locally... > > [2018-09-25 14:31:25.354633] I [resource(worker > /bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume > duration=1.1839 > > [2018-09-25 14:31:25.355073] I [subcmds(worker > /bricks/brick1/brick):70:subcmd_worker] <top>: Worker spawn successful. 
> Acknowledging back to monitor > > [2018-09-25 14:31:27.439034] I [master(worker > /bricks/brick1/brick):1593:register] _GMaster: Working dir > path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick > > [2018-09-25 14:31:27.441847] I [resource(worker > /bricks/brick1/brick):1282:service_loop] GLUSTER: Register time > time=1537885887 > > [2018-09-25 14:31:27.465053] I [gsyncdstatus(worker > /bricks/brick1/brick):277:set_active] GeorepStatus: Worker Status > Change status=Active > > [2018-09-25 14:31:27.471021] I [gsyncdstatus(worker > /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl > Status Change status=History Crawl > > [2018-09-25 14:31:27.471484] I [master(worker > /bricks/brick1/brick):1507:crawl] _GMaster: starting history crawl > turns=1 stime=(0, 0) entry_stime=None etime=1537885887 > > [2018-09-25 14:31:27.472564] E [repce(agent > /bricks/brick1/brick):105:worker] <top>: call failed: > > Traceback (most recent call last): > > File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 101, in > worker > > res = getattr(self.obj, rmeth)(*in_data[2:]) > > File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line > 53, in history > > num_parallel) > > File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line > 104, in cl_history_changelog > > raise ChangelogHistoryNotAvailable() > > ChangelogHistoryNotAvailable > > [2018-09-25 14:31:27.480632] E [repce(worker > /bricks/brick1/brick):197:__call__] RepceClient: call failed > call=31250:140272364406592:1537885887.47 method=history > error=ChangelogHistoryNotAvailable > > [2018-09-25 14:31:27.480958] I [resource(worker > /bricks/brick1/brick):1295:service_loop] GLUSTER: Changelog history not > available, using xsync > > [2018-09-25 14:31:27.495117] I [master(worker > /bricks/brick1/brick):1623:crawl] _GMaster: starting hybrid crawl > stime=(0, 0) > > [2018-09-25 14:31:27.502083] I [gsyncdstatus(worker > /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl > Status Change status=Hybrid Crawl > > [2018-09-25 14:31:29.505284] I [master(worker > /bricks/brick1/brick):1634:crawl] _GMaster: processing xsync changelog > path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick/xsync/XSYNC-CHANGELOG.1537885887 > > > > tail -f > /var/log/glusterfs/geo-replication-slaves/glustervol1_slave_3_glustervol1/gsyncd.log > > [2018-09-25 13:49:24.141303] I [repce(slave > master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on > reaching EOF. > > [2018-09-25 13:49:36.602051] W [gsyncd(slave > master/bricks/brick1/brick):293:main] <top>: Session config file not > exists, using the default config > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-25 13:49:36.629415] I [resource(slave > master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume > locally... > > [2018-09-25 13:49:37.701642] I [resource(slave > master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster > volume duration=1.0718 > > [2018-09-25 13:49:37.704282] I [resource(slave > master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening > > [2018-09-25 14:10:27.70952] I [repce(slave > master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on > reaching EOF. 
> > [2018-09-25 14:10:39.632124] W [gsyncd(slave > master/bricks/brick1/brick):293:main] <top>: Session config file not > exists, using the default config > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-25 14:10:39.650958] I [resource(slave > master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume > locally... > > [2018-09-25 14:10:40.729355] I [resource(slave > master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster > volume duration=1.0781 > > [2018-09-25 14:10:40.730650] I [resource(slave > master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening > > [2018-09-25 14:31:10.291064] I [repce(slave > master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on > reaching EOF. > > [2018-09-25 14:31:22.802237] W [gsyncd(slave > master/bricks/brick1/brick):293:main] <top>: Session config file not > exists, using the default config > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-25 14:31:22.828418] I [resource(slave > master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume > locally... > > [2018-09-25 14:31:23.910206] I [resource(slave > master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster > volume duration=1.0813 > > [2018-09-25 14:31:23.913369] I [resource(slave > master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening > > > > Any ideas how to resolve this without re-creating everything again? Can I > reset the changelog history? > > > > Regards, > > Christian > > > > *From: *<gluster-users-bounces at gluster.org> on behalf of "Kotte, > Christian (Ext)" <christian.kotte at novartis.com> > *Date: *Monday, 24. September 2018 at 17:20 > *To: *Kotresh Hiremath Ravishankar <khiremat at redhat.com> > *Cc: *Gluster Users <gluster-users at gluster.org> > *Subject: *Re: [Gluster-users] [geo-rep] Replication faulty - gsyncd.log > OSError: [Errno 13] Permission denied > > > > I don't configure the permissions of /bricks/brick1/brick/.glusterfs. I > don't even see it on the local GlusterFS mount. > > > > Not sure why the permissions are configured with S and the AD group? > > > > Regards, > > Christian > > > > *From: *<gluster-users-bounces at gluster.org> on behalf of "Kotte, > Christian (Ext)" <christian.kotte at novartis.com> > *Date: *Monday, 24. September 2018 at 17:10 > *To: *Kotresh Hiremath Ravishankar <khiremat at redhat.com> > *Cc: *Gluster Users <gluster-users at gluster.org> > *Subject: *Re: [Gluster-users] [geo-rep] Replication faulty - gsyncd.log > OSError: [Errno 13] Permission denied > > > > Yeah right. I get permission denied. 
> > > > [geoaccount at slave ~]$ ll > /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e > > ls: cannot access > /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e: > Permission denied > > [geoaccount at slave ~]$ ll /bricks/brick1/brick/.glusterfs/29/d1/ > > ls: cannot access /bricks/brick1/brick/.glusterfs/29/d1/: Permission denied > > [geoaccount at slave ~]$ ll /bricks/brick1/brick/.glusterfs/29/ > > ls: cannot access /bricks/brick1/brick/.glusterfs/29/: Permission denied > > [geoaccount at slave ~]$ ll /bricks/brick1/brick/.glusterfs/ > > ls: cannot open directory /bricks/brick1/brick/.glusterfs/: Permission > denied > > > > [root at slave ~]# ll /bricks/brick1/brick/.glusterfs/29 > > total 0 > > drwx--S---+ 2 root AD+group 50 Sep 10 07:29 16 > > drwx--S---+ 2 root AD+group 50 Sep 10 07:29 33 > > drwx--S---+ 2 root AD+group 50 Sep 10 07:29 5e > > drwx--S---+ 2 root AD+group 50 Sep 10 07:29 73 > > drwx--S---+ 2 root AD+group 50 Sep 10 07:29 b2 > > drwx--S---+ 2 root AD+group 50 Sep 21 09:39 d1 > > drwx--S---+ 2 root AD+group 50 Sep 10 07:29 d7 > > drwx--S---+ 2 root AD+group 50 Sep 10 07:29 e6 > > drwx--S---+ 2 root AD+group 50 Sep 10 07:29 eb > > [root at slave ~]# > > > > However, the strange thing is that I could replicate new files and folders > before. The replication has been broken since the 'New folder' was created. > > > > These are the permissions on a dev/test system: > > [root at slave-dev ~]# ll /bricks/brick1/brick/.glusterfs/ > > total 3136 > > drwx------. 44 root root 4096 Aug 22 18:19 00 > > drwx------. 50 root root 4096 Sep 12 13:14 01 > > drwx------. 54 root root 4096 Sep 13 11:33 02 > > drwx------. 59 root root 4096 Aug 22 18:21 03 > > drwx------. 60 root root 4096 Sep 12 13:14 04 > > drwx------. 68 root root 4096 Aug 24 12:36 05 > > drwx------. 56 root root 4096 Aug 22 18:21 06 > > drwx------. 46 root root 4096 Aug 22 18:21 07 > > drwx------. 51 root root 4096 Aug 22 18:21 08 > > drwx------. 42 root root 4096 Aug 22 18:21 09 > > drwx------. 44 root root 4096 Sep 13 11:16 0a > > > > I've configured an AD group, SGID bit, and ACLs via Ansible on the local > mount point. Could this be an issue? Should I avoid configuring the > permissions on .glusterfs and below? > > > > # ll /mnt/glustervol1/ > > total 12 > > drwxrwsr-x+ 4 AD+user AD+group 4096 Jul 13 07:46 Scripts > > drwxrwxr-x+ 10 AD+user AD+group 4096 Jun 12 12:03 Software > > -rw-rw-r--+ 1 root AD+group 0 Aug 8 08:44 test > > drwxr-xr-x+ 6 AD+user AD+group 4096 Apr 18 10:58 tftp > > > > glusterfs_volumes: > > [...] > > permissions: > > mode: "02775" > > owner: root > > group: "AD+group" > > acl_permissions: rw > > [...] > > > > # root directory is owned by root. 
> > # set permissions to 'g+s' to automatically set the group to "AD+group" > > # permissions of individual files will be set by Samba during creation > > - name: Configure volume directory permission 1/2 > > tags: glusterfs > > file: > > path: /mnt/{{ item.volume }} > > state: directory > > mode: "{{ item.permissions.mode }}" > > owner: "{{ item.permissions.owner }}" > > group: "{{ item.permissions.group }}" > > with_items: "{{ glusterfs_volumes }}" > > loop_control: > > label: "{{ item.volume }}" > > when: item.permissions is defined > > > > # ACL needs to be set to override default umask and grant "AD+group" write > permissions > > - name: Configure volume directory permission 2/2 (ACL) > > tags: glusterfs > > acl: > > path: /mnt/{{ item.volume }} > > default: yes > > entity: "{{ item.permissions.group }}" > > etype: group > > permissions: "{{ item.permissions.acl_permissions }}" > > state: present > > with_items: "{{ glusterfs_volumes }}" > > loop_control: > > label: "{{ item.volume }}" > > when: item.permissions is defined > > > > Regards, > > Christian > > > > *From: *Kotresh Hiremath Ravishankar <khiremat at redhat.com> > *Date: *Monday, 24. September 2018 at 16:20 > *To: *"Kotte, Christian (Ext)" <christian.kotte at novartis.com> > *Cc: *Gluster Users <gluster-users at gluster.org> > *Subject: *Re: [Gluster-users] [geo-rep] Replication faulty - gsyncd.log > OSError: [Errno 13] Permission denied > > > > I think I get what's happening. The geo-rep session is non-root. > > Could you do a readlink on the brick path mentioned above > /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e > > from the geoaccount user and see if you are getting "Permission Denied" > errors? > > > > Thanks, > > Kotresh HR > > > > On Mon, Sep 24, 2018 at 7:35 PM Kotte, Christian (Ext) < > christian.kotte at novartis.com> wrote: > > Ok. It happens on all slave nodes (and on the interimmaster as well). > > > > It's like I assumed. These are the logs of one of the slaves: > > > > gsyncd.log > > [2018-09-24 13:52:25.418382] I [repce(slave > slave/bricks/brick1/brick):80:service_loop] RepceServer: terminating on > reaching EOF. > > [2018-09-24 13:52:37.95297] W [gsyncd(slave > slave/bricks/brick1/brick):293:main] <top>: Session config file not exists, > using the default config > path=/var/lib/glusterd/geo-replication/glustervol1_slave_glustervol1/gsyncd.conf > > [2018-09-24 13:52:37.109643] I [resource(slave > slave/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume > locally... 
> > [2018-09-24 13:52:38.303920] I [resource(slave > slave/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster > volume duration=1.1941 > > [2018-09-24 13:52:38.304771] I [resource(slave > slave/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening > > [2018-09-24 13:52:41.981554] I [resource(slave > slave/bricks/brick1/brick):598:entry_ops] <top>: Special case: rename on > mkdir gfid=29d1d60d-1ad6-45fc-87e0-93d478f7331e > entry='.gfid/6b97b987-8aef-46c3-af27-20d3aa883016/New folder' > > [2018-09-24 13:52:42.45641] E [repce(slave > slave/bricks/brick1/brick):105:worker] <top>: call failed: > > Traceback (most recent call last): > > File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 101, in > worker > > res = getattr(self.obj, rmeth)(*in_data[2:]) > > File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 599, > in entry_ops > > src_entry = get_slv_dir_path(slv_host, slv_volume, gfid) > > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 682, > in get_slv_dir_path > > [ENOENT], [ESTALE]) > > File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 540, > in errno_wrap > > return call(*arg) > > OSError: [Errno 13] Permission denied: > '/bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e' > > [2018-09-24 13:52:42.81794] I [repce(slave > slave/bricks/brick1/brick):80:service_loop] RepceServer: terminating on > reaching EOF. > > [2018-09-24 13:52:53.459676] W [gsyncd(slave > slave/bricks/brick1/brick):293:main] <top>: Session config file not exists, > using the default config > path=/var/lib/glusterd/geo-replication/glustervol1_slave_glustervol1/gsyncd.conf > > [2018-09-24 13:52:53.473500] I [resource(slave > slave/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume > locally... > > [2018-09-24 13:52:54.659044] I [resource(slave > slave/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume > duration=1.1854 > > [2018-09-24 13:52:54.659837] I [resource(slave > slave/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening > > > > The folder 'New folder' was created via Samba and renamed by my > colleague right after creation. > > [root at slave glustervol1_slave_glustervol1]# ls > /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e/ > > [root at slave glustervol1_slave_glustervol1]# ls > /bricks/brick1/brick/.glusterfs/29/d1/ -al > > total 0 > > drwx--S---+ 2 root AD+group 50 Sep 21 09:39 . > > drwx--S---+ 11 root AD+group 96 Sep 21 09:39 .. > > lrwxrwxrwx. 1 root AD+group 75 Sep 21 09:39 > 29d1d60d-1ad6-45fc-87e0-93d478f7331e -> > ../../6b/97/6b97b987-8aef-46c3-af27-20d3aa883016/vRealize Operation Manager > > > > I tried creating the folder in > /bricks/brick1/brick/.glusterfs/6b/97/6b97b987-8aef-46c3-af27-20d3aa883016/, > but it didn't change anything. 
> > > > mnt-slave-bricks-brick1-brick.log > > [2018-09-24 13:51:10.625723] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-glustervol1-client-0: error returned while attempting to connect to > host:(null), port:0 > > [2018-09-24 13:51:10.626092] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-glustervol1-client-0: error returned while attempting to connect to > host:(null), port:0 > > [2018-09-24 13:51:10.626181] I [rpc-clnt.c:2105:rpc_clnt_reconfig] > 0-glustervol1-client-0: changing port to 49152 (from 0) > > [2018-09-24 13:51:10.643111] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-glustervol1-client-0: error returned while attempting to connect to > host:(null), port:0 > > [2018-09-24 13:51:10.643489] W [dict.c:923:str_to_data] > (-->/usr/lib64/glusterfs/4.1.3/xlator/protocol/client.so(+0x4131a) > [0x7fafb023831a] -->/lib64/libglusterfs.so.0(dict_set_str+0x16) > [0x7fafbdb83266] -->/lib64/libglusterfs.so.0(str_to_data+0x91) > [0x7fafbdb7fea1] ) 0-dict: value is NULL [Invalid argument] > > [2018-09-24 13:51:10.643507] I [MSGID: 114006] > [client-handshake.c:1308:client_setvolume] 0-glustervol1-client-0: failed > to set process-name in handshake msg > > [2018-09-24 13:51:10.643541] W [rpc-clnt.c:1753:rpc_clnt_submit] > 0-glustervol1-client-0: error returned while attempting to connect to > host:(null), port:0 > > [2018-09-24 13:51:10.671460] I [MSGID: 114046] > [client-handshake.c:1176:client_setvolume_cbk] 0-glustervol1-client-0: > Connected to glustervol1-client-0, attached to remote volume > '/bricks/brick1/brick'. > > [2018-09-24 13:51:10.672694] I [fuse-bridge.c:4294:fuse_init] > 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel > 7.22 > > [2018-09-24 13:51:10.672715] I [fuse-bridge.c:4927:fuse_graph_sync] > 0-fuse: switched to graph 0 > > [2018-09-24 13:51:10.673329] I [MSGID: 109005] > [dht-selfheal.c:2342:dht_selfheal_directory] 0-glustervol1-dht: Directory > selfheal failed: Unable to form layout for directory / > > [2018-09-24 13:51:16.116458] I [fuse-bridge.c:5199:fuse_thread_proc] > 0-fuse: initating unmount of > /var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E > > [2018-09-24 13:51:16.116595] W [glusterfsd.c:1514:cleanup_and_exit] > (-->/lib64/libpthread.so.0(+0x7e25) [0x7fafbc9eee25] > -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55d5dac5dd65] > -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55d5dac5db8b] ) 0-: > received signum (15), shutting down > > [2018-09-24 13:51:16.116616] I [fuse-bridge.c:5981:fini] 0-fuse: > Unmounting '/var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E'. > > [2018-09-24 13:51:16.116625] I [fuse-bridge.c:5986:fini] 0-fuse: Closing > fuse connection to '/var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E'. > > > > Regards, > > Christian > > > > *From: *Kotresh Hiremath Ravishankar <khiremat at redhat.com> > *Date: *Saturday, 22. September 2018 at 06:52 > *To: *"Kotte, Christian (Ext)" <christian.kotte at novartis.com> > *Cc: *Gluster Users <gluster-users at gluster.org> > *Subject: *Re: [Gluster-users] [geo-rep] Replication faulty - gsyncd.log > OSError: [Errno 13] Permission denied > > > > The problem occurred on the slave side and its error is propagated to the master. > Almost any traceback that involves repce points to a problem on the slave. > Just check a few lines above in the log to find the slave node the crashed > worker is connected to, and get the geo-replication logs from that node to debug further. 
> > > > > > > > > > > > On Fri, 21 Sep 2018, 20:10 Kotte, Christian (Ext), < > christian.kotte at novartis.com> wrote: > > Hi, > > > > Any idea how to troubleshoot this? > > > > New folders and files were created on the master and the replication went > faulty. They were created via Samba. > > > > Version: GlusterFS 4.1.3 > > > > [root at master]# gluster volume geo-replication status > > > > MASTER NODE MASTER VOL MASTER BRICK > SLAVE USER > SLAVE SLAVE > NODE STATUS CRAWL STATUS LAST_SYNCED > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > master glustervol1 /bricks/brick1/brick geoaccount > ssh://geoaccount at slave_1::glustervol1 N/A Faulty > N/A N/A > > master glustervol1 /bricks/brick1/brick geoaccount > ssh://geoaccount at slave_2::glustervol1 N/A Faulty > N/A N/A > > master glustervol1 /bricks/brick1/brick geoaccount > ssh://geoaccount at interimmaster::glustervol1 N/A Faulty > N/A N/A > > > > The following error is repeatedly logged in the gsyncd.logs: > > [2018-09-21 14:26:38.611479] I [repce(agent > /bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching > EOF. > > [2018-09-21 14:26:39.211527] I [monitor(monitor):279:monitor] Monitor: > worker died in startup phase brick=/bricks/brick1/brick > > [2018-09-21 14:26:39.214322] I > [gsyncdstatus(monitor):244:set_worker_status] GeorepStatus: Worker Status > Change status=Faulty > > [2018-09-21 14:26:49.318953] I [monitor(monitor):158:monitor] Monitor: > starting gsyncd worker brick=/bricks/brick1/brick slave_node> slave_3 <http://nrchbs-slp2020.nibr.novartis.net> > > [2018-09-21 14:26:49.471532] I [gsyncd(agent > /bricks/brick1/brick):297:main] <top>: Using session config file > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-21 14:26:49.473917] I [changelogagent(agent > /bricks/brick1/brick):72:__init__] ChangelogAgent: Agent listining... > > [2018-09-21 14:26:49.491359] I [gsyncd(worker > /bricks/brick1/brick):297:main] <top>: Using session config file > path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf > > [2018-09-21 14:26:49.538049] I [resource(worker > /bricks/brick1/brick):1377:connect_remote] SSH: Initializing SSH connection > between master and slave... > > [2018-09-21 14:26:53.5017] I [resource(worker > /bricks/brick1/brick):1424:connect_remote] SSH: SSH connection between > master and slave established. duration=3.4665 > > [2018-09-21 14:26:53.5419] I [resource(worker > /bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume > locally... > > [2018-09-21 14:26:54.120374] I [resource(worker > /bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume > duration=1.1146 > > [2018-09-21 14:26:54.121012] I [subcmds(worker > /bricks/brick1/brick):70:subcmd_worker] <top>: Worker spawn successful. 
> Acknowledging back to monitor > > [2018-09-21 14:26:56.144460] I [master(worker > /bricks/brick1/brick):1593:register] _GMaster: Working dir > path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick > > [2018-09-21 14:26:56.145145] I [resource(worker > /bricks/brick1/brick):1282:service_loop] GLUSTER: Register time > time=1537540016 > > [2018-09-21 14:26:56.160064] I [gsyncdstatus(worker > /bricks/brick1/brick):277:set_active] GeorepStatus: Worker Status Change > status=Active > > [2018-09-21 14:26:56.161175] I [gsyncdstatus(worker > /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl > Status Change status=History Crawl > > [2018-09-21 14:26:56.161536] I [master(worker > /bricks/brick1/brick):1507:crawl] _GMaster: starting history crawl > turns=1 stime=(1537522637, 0) entry_stime=(1537537141, 0) > etime=1537540016 > > [2018-09-21 14:26:56.164277] I [master(worker > /bricks/brick1/brick):1536:crawl] _GMaster: slave's time > stime=(1537522637, 0) > > [2018-09-21 14:26:56.197065] I [master(worker > /bricks/brick1/brick):1360:process] _GMaster: Skipping already processed > entry ops to_changelog=1537522638 num_changelogs=1 > from_changelog=1537522638 > > [2018-09-21 14:26:56.197402] I [master(worker > /bricks/brick1/brick):1374:process] _GMaster: Entry Time Taken MKD=0 > MKN=0 LIN=0 SYM=0 REN=0 RMD=0 CRE=0 duration=0.0000 UNL=1 > > [2018-09-21 14:26:56.197623] I [master(worker > /bricks/brick1/brick):1384:process] _GMaster: Data/Metadata Time Taken > SETA=0 SETX=0 meta_duration=0.0000 data_duration=0.0284 DATA=0 > XATT=0 > > [2018-09-21 14:26:56.198230] I [master(worker > /bricks/brick1/brick):1394:process] _GMaster: Batch Completed > changelog_end=1537522638 entry_stime=(1537537141, 0) > changelog_start=1537522638 stime=(1537522637, 0) duration=0.0333 > num_changelogs=1 mode=history_changelog > > [2018-09-21 14:26:57.200436] I [master(worker > /bricks/brick1/brick):1536:crawl] _GMaster: slave's time > stime=(1537522637, 0) > > [2018-09-21 14:26:57.528625] E [repce(worker > /bricks/brick1/brick):197:__call__] RepceClient: call failed > call=17209:140650361157440:1537540017.21 method=entry_ops > error=OSError > > [2018-09-21 14:26:57.529371] E [syncdutils(worker > /bricks/brick1/brick):332:log_raise_exception] <top>: FAIL: > > Traceback (most recent call last): > > File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in > main > > func(args) > > File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in > subcmd_worker > > local.service_loop(remote) > > File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1288, > in service_loop > > g3.crawlwrap(oneshot=True) > > File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in > crawlwrap > > self.crawl() > > File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1545, in > crawl > > self.changelogs_batch_process(changes) > > File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1445, in > changelogs_batch_process > > self.process(batch) > > File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1280, in > process > > self.process_change(change, done, retry) > > File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1179, in > process_change > > failures = self.slave.server.entry_ops(entries) > > File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in > __call__ > > return self.ins(self.meth, *a) > > File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in > __call__ > > raise res 
> > OSError: [Errno 13] Permission denied: > '/bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e' > > > > The permissions look fine. The replication is done via the geo user instead of > root. It should be able to read, but I'm not sure whether the syncdaemon runs > under geoaccount. > > > > [root at master vRealize Operation Manager]# ll > /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e > > lrwxrwxrwx. 1 root root 75 Sep 21 09:39 > /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e > -> ../../6b/97/6b97b987-8aef-46c3-af27-20d3aa883016/vRealize Operation > Manager > > > > [root at master vRealize Operation Manager]# ll > /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e/ > > total 4 > > drwxrwxr-x. 2 AD+user AD+group 131 Sep 21 10:14 6.7 > > drwxrwxr-x. 2 AD+user AD+group 4096 Sep 21 09:43 7.0 > > drwxrwxr-x. 2 AD+user AD+group 57 Sep 21 10:28 7.5 > > [root at master vRealize Operation Manager]# > > > > It could be possible that the folder was renamed. I had 3 similar issues > since I migrated to GlusterFS 4.x but couldn't investigate much. I needed > to completely wipe GlusterFS and geo-replication to get rid of this error... > > > > Any help is appreciated. > > > > Regards, > > > > *Christian Kotte* > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > > Thanks and Regards, > > Kotresh H R > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users
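
[Editor's note: the readlink check Kotresh suggests above can be scripted. A minimal sketch, assuming the geoaccount user name, brick path, and gfid symlink quoted in this thread (adjust them to your environment); run as root, it switches to the unprivileged sync account and attempts the same readlink that the worker needs:]

    #!/usr/bin/env python
    # Attempt the readlink on the gfid backend path as the non-root
    # geo-rep account instead of root. ACCOUNT and GFID_PATH are the
    # examples from this thread and are assumptions to adjust.
    import errno
    import os
    import pwd

    ACCOUNT = "geoaccount"
    GFID_PATH = ("/bricks/brick1/brick/.glusterfs/29/d1/"
                 "29d1d60d-1ad6-45fc-87e0-93d478f7331e")

    # Must start as root; drop to the sync account (groups and gid first).
    pw = pwd.getpwnam(ACCOUNT)
    os.initgroups(ACCOUNT, pw.pw_gid)
    os.setgid(pw.pw_gid)
    os.setuid(pw.pw_uid)

    try:
        print("readlink -> " + os.readlink(GFID_PATH))
    except OSError as e:
        if e.errno == errno.EACCES:
            print("Permission denied for %s on %s" % (ACCOUNT, GFID_PATH))
        else:
            raise

[If this prints "Permission denied" while the same readlink succeeds as root, the non-root worker cannot traverse the root-owned drwx--S---+ directories listed earlier, which matches the OSError: [Errno 13] traceback.]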
Kotresh Hiremath Ravishankar
2019-Mar-20 06:42 UTC
[Gluster-users] [geo-rep] Replication faulty - gsyncd.log OSError: [Errno 13] Permission denied
Hi Andy,

This is an issue with non-root geo-rep sessions and it is not fixed yet. Could you please raise a bug for this issue?

Thanks,
Kotresh HR

On Wed, Mar 20, 2019 at 11:53 AM Andy Coates <andy.coates at gmail.com> wrote:
> We're seeing the same permission denied errors when running as a non-root
> geosync user.
>
> Does anyone know what the underlying issue is?
>
> [...]
-e ssh -oPasswordAuthentication=no >> -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem >> -p 22 -oControlMaster=auto -S >> /tmp/gsyncd-aux-ssh-gg758Z/caec4d1d03cc28bc1853f692e291164f.sock >> slave_3:/proc/15919/cwd error=11 >> >> [2018-09-25 14:31:10.237371] I [repce(agent >> /bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching >> EOF. >> >> [2018-09-25 14:31:10.430820] I >> [gsyncdstatus(monitor):244:set_worker_status] GeorepStatus: Worker Status >> Change status=Faulty >> >> [2018-09-25 14:31:20.541475] I [monitor(monitor):158:monitor] Monitor: >> starting gsyncd worker brick=/bricks/brick1/brick slave_node=slave_3 >> >> [2018-09-25 14:31:20.806518] I [gsyncd(agent >> /bricks/brick1/brick):297:main] <top>: Using session config file >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf >> >> [2018-09-25 14:31:20.816536] I [changelogagent(agent >> /bricks/brick1/brick):72:__init__] ChangelogAgent: Agent listining... >> >> [2018-09-25 14:31:20.821574] I [gsyncd(worker >> /bricks/brick1/brick):297:main] <top>: Using session config file >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf >> >> [2018-09-25 14:31:20.882128] I [resource(worker >> /bricks/brick1/brick):1377:connect_remote] SSH: Initializing SSH connection >> between master and slave... >> >> [2018-09-25 14:31:24.169857] I [resource(worker >> /bricks/brick1/brick):1424:connect_remote] SSH: SSH connection between >> master and slave established. duration=3.2873 >> >> [2018-09-25 14:31:24.170401] I [resource(worker >> /bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume >> locally... >> >> [2018-09-25 14:31:25.354633] I [resource(worker >> /bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume >> duration=1.1839 >> >> [2018-09-25 14:31:25.355073] I [subcmds(worker >> /bricks/brick1/brick):70:subcmd_worker] <top>: Worker spawn successful. 
>> Acknowledging back to monitor >> >> [2018-09-25 14:31:27.439034] I [master(worker >> /bricks/brick1/brick):1593:register] _GMaster: Working dir >> path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick >> >> [2018-09-25 14:31:27.441847] I [resource(worker >> /bricks/brick1/brick):1282:service_loop] GLUSTER: Register time >> time=1537885887 >> >> [2018-09-25 14:31:27.465053] I [gsyncdstatus(worker >> /bricks/brick1/brick):277:set_active] GeorepStatus: Worker Status >> Change status=Active >> >> [2018-09-25 14:31:27.471021] I [gsyncdstatus(worker >> /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl >> Status Change status=History Crawl >> >> [2018-09-25 14:31:27.471484] I [master(worker >> /bricks/brick1/brick):1507:crawl] _GMaster: starting history crawl >> turns=1 stime=(0, 0) entry_stime=None etime=1537885887 >> >> [2018-09-25 14:31:27.472564] E [repce(agent >> /bricks/brick1/brick):105:worker] <top>: call failed: >> >> Traceback (most recent call last): >> >> File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 101, in >> worker >> >> res = getattr(self.obj, rmeth)(*in_data[2:]) >> >> File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line >> 53, in history >> >> num_parallel) >> >> File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line >> 104, in cl_history_changelog >> >> raise ChangelogHistoryNotAvailable() >> >> ChangelogHistoryNotAvailable >> >> [2018-09-25 14:31:27.480632] E [repce(worker >> /bricks/brick1/brick):197:__call__] RepceClient: call failed >> call=31250:140272364406592:1537885887.47 method=history >> error=ChangelogHistoryNotAvailable >> >> [2018-09-25 14:31:27.480958] I [resource(worker >> /bricks/brick1/brick):1295:service_loop] GLUSTER: Changelog history not >> available, using xsync >> >> [2018-09-25 14:31:27.495117] I [master(worker >> /bricks/brick1/brick):1623:crawl] _GMaster: starting hybrid crawl >> stime=(0, 0) >> >> [2018-09-25 14:31:27.502083] I [gsyncdstatus(worker >> /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl >> Status Change status=Hybrid Crawl >> >> [2018-09-25 14:31:29.505284] I [master(worker >> /bricks/brick1/brick):1634:crawl] _GMaster: processing xsync changelog >> path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick/xsync/XSYNC-CHANGELOG.1537885887 >> >> >> >> tail -f >> /var/log/glusterfs/geo-replication-slaves/glustervol1_slave_3_glustervol1/gsyncd.log >> >> [2018-09-25 13:49:24.141303] I [repce(slave >> master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on >> reaching EOF. >> >> [2018-09-25 13:49:36.602051] W [gsyncd(slave >> master/bricks/brick1/brick):293:main] <top>: Session config file not >> exists, using the default config >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf >> >> [2018-09-25 13:49:36.629415] I [resource(slave >> master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume >> locally... >> >> [2018-09-25 13:49:37.701642] I [resource(slave >> master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster >> volume duration=1.0718 >> >> [2018-09-25 13:49:37.704282] I [resource(slave >> master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening >> >> [2018-09-25 14:10:27.70952] I [repce(slave >> master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on >> reaching EOF. 
>> >> [2018-09-25 14:10:39.632124] W [gsyncd(slave >> master/bricks/brick1/brick):293:main] <top>: Session config file not >> exists, using the default config >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf >> >> [2018-09-25 14:10:39.650958] I [resource(slave >> master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume >> locally... >> >> [2018-09-25 14:10:40.729355] I [resource(slave >> master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster >> volume duration=1.0781 >> >> [2018-09-25 14:10:40.730650] I [resource(slave >> master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening >> >> [2018-09-25 14:31:10.291064] I [repce(slave >> master/bricks/brick1/brick):80:service_loop] RepceServer: terminating on >> reaching EOF. >> >> [2018-09-25 14:31:22.802237] W [gsyncd(slave >> master/bricks/brick1/brick):293:main] <top>: Session config file not >> exists, using the default config >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf >> >> [2018-09-25 14:31:22.828418] I [resource(slave >> master/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume >> locally... >> >> [2018-09-25 14:31:23.910206] I [resource(slave >> master/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster >> volume duration=1.0813 >> >> [2018-09-25 14:31:23.913369] I [resource(slave >> master/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening >> >> >> >> Any ideas how to resolve this without re-creating everything again? Can I >> reset the changelog history? >> >> >> >> Regards, >> >> Christian >> >> >> >> *From: *<gluster-users-bounces at gluster.org> on behalf of "Kotte, >> Christian (Ext)" <christian.kotte at novartis.com> >> *Date: *Monday, 24. September 2018 at 17:20 >> *To: *Kotresh Hiremath Ravishankar <khiremat at redhat.com> >> *Cc: *Gluster Users <gluster-users at gluster.org> >> *Subject: *Re: [Gluster-users] [geo-rep] Replication faulty - gsyncd.log >> OSError: [Errno 13] Permission denied >> >> >> >> I don?t configure the permissions of /bricks/brick1/brick/.glusterfs. I >> don?t even see it on the local GlusterFS mount. >> >> >> >> Not sure why the permissions are configured with S and the AD group? >> >> >> >> Regards, >> >> Christian >> >> >> >> *From: *<gluster-users-bounces at gluster.org> on behalf of "Kotte, >> Christian (Ext)" <christian.kotte at novartis.com> >> *Date: *Monday, 24. September 2018 at 17:10 >> *To: *Kotresh Hiremath Ravishankar <khiremat at redhat.com> >> *Cc: *Gluster Users <gluster-users at gluster.org> >> *Subject: *Re: [Gluster-users] [geo-rep] Replication faulty - gsyncd.log >> OSError: [Errno 13] Permission denied >> >> >> >> Yeah right. I get permission denied. 
>> >> >> >> [geoaccount at slave ~]$ ll >> /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e >> >> ls: cannot access >> /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e: >> Permission denied >> >> [geoaccount at slave ~]$ ll /bricks/brick1/brick/.glusterfs/29/d1/ >> >> ls: cannot access /bricks/brick1/brick/.glusterfs/29/d1/: Permission >> denied >> >> [geoaccount at slave ~]$ ll /bricks/brick1/brick/.glusterfs/29/ >> >> ls: cannot access /bricks/brick1/brick/.glusterfs/29/: Permission denied >> >> [geoaccount at slave ~]$ ll /bricks/brick1/brick/.glusterfs/ >> >> ls: cannot open directory /bricks/brick1/brick/.glusterfs/: Permission >> denied >> >> >> >> [root at slave ~]# ll /bricks/brick1/brick/.glusterfs/29 >> >> total 0 >> >> drwx--S---+ 2 root AD+group 50 Sep 10 07:29 16 >> >> drwx--S---+ 2 root AD+group 50 Sep 10 07:29 33 >> >> drwx--S---+ 2 root AD+group 50 Sep 10 07:29 5e >> >> drwx--S---+ 2 root AD+group 50 Sep 10 07:29 73 >> >> drwx--S---+ 2 root AD+group 50 Sep 10 07:29 b2 >> >> drwx--S---+ 2 root AD+group 50 Sep 21 09:39 d1 >> >> drwx--S---+ 2 root AD+group 50 Sep 10 07:29 d7 >> >> drwx--S---+ 2 root AD+group 50 Sep 10 07:29 e6 >> >> drwx--S---+ 2 root AD+group 50 Sep 10 07:29 eb >> >> [root at slave ~]# >> >> >> >> However, the strange thing is that I could replicate new files and >> folders before. The replication is broken since the ?New folder? was >> created. >> >> >> >> These are the permissions on a dev/test system: >> >> [root at slave-dev ~]# ll /bricks/brick1/brick/.glusterfs/ >> >> total 3136 >> >> drwx------. 44 root root 4096 Aug 22 18:19 00 >> >> drwx------. 50 root root 4096 Sep 12 13:14 01 >> >> drwx------. 54 root root 4096 Sep 13 11:33 02 >> >> drwx------. 59 root root 4096 Aug 22 18:21 03 >> >> drwx------. 60 root root 4096 Sep 12 13:14 04 >> >> drwx------. 68 root root 4096 Aug 24 12:36 05 >> >> drwx------. 56 root root 4096 Aug 22 18:21 06 >> >> drwx------. 46 root root 4096 Aug 22 18:21 07 >> >> drwx------. 51 root root 4096 Aug 22 18:21 08 >> >> drwx------. 42 root root 4096 Aug 22 18:21 09 >> >> drwx------. 44 root root 4096 Sep 13 11:16 0a >> >> >> >> I?ve configured an AD group, SGID bit, and ACLs via Ansible on the local >> mount point. Could this be an issue? Should I avoid configuring the >> permissions on .glusterfs and below? >> >> >> >> # ll /mnt/glustervol1/ >> >> total 12 >> >> drwxrwsr-x+ 4 AD+user AD+group 4096 Jul 13 07:46 Scripts >> >> drwxrwxr-x+ 10 AD+user AD+group 4096 Jun 12 12:03 Software >> >> -rw-rw-r--+ 1 root AD+group 0 Aug 8 08:44 test >> >> drwxr-xr-x+ 6 AD+user AD+group 4096 Apr 18 10:58 tftp >> >> >> >> glusterfs_volumes: >> >> [?] >> >> permissions: >> >> mode: "02775" >> >> owner: root >> >> group: "AD+group" >> >> acl_permissions: rw >> >> [?] >> >> >> >> # root directory is owned by root. 
>> >> # set permissions to 'g+s' to automatically set the group to "AD+group" >> >> # permissions of individual files will be set by Samba during creation >> >> - name: Configure volume directory permission 1/2 >> >> tags: glusterfs >> >> file: >> >> path: /mnt/{{ item.volume }} >> >> state: directory >> >> mode: "{{ item.permissions.mode }}" >> >> owner: "{{ item.permissions.owner }}" >> >> group: "{{ item.permissions.group }}" >> >> with_items: "{{ glusterfs_volumes }}" >> >> loop_control: >> >> label: "{{ item.volume }}" >> >> when: item.permissions is defined >> >> >> >> # ACL needs to be set to override default umask and grant "AD+group" >> write permissions >> >> - name: Configure volume directory permission 2/2 (ACL) >> >> tags: glusterfs >> >> acl: >> >> path: /mnt/{{ item.volume }} >> >> default: yes >> >> entity: "{{ item.permissions.group }}" >> >> etype: group >> >> permissions: "{{ item.permissions.acl_permissions }}" >> >> state: present >> >> with_items: "{{ glusterfs_volumes }}" >> >> loop_control: >> >> label: "{{ item.volume }}" >> >> when: item.permissions is defined >> >> >> >> Regards, >> >> Christian >> >> >> >> *From: *Kotresh Hiremath Ravishankar <khiremat at redhat.com> >> *Date: *Monday, 24. September 2018 at 16:20 >> *To: *"Kotte, Christian (Ext)" <christian.kotte at novartis.com> >> *Cc: *Gluster Users <gluster-users at gluster.org> >> *Subject: *Re: [Gluster-users] [geo-rep] Replication faulty - gsyncd.log >> OSError: [Errno 13] Permission denied >> >> >> >> I think I am get what's happening. The geo-rep session is non-root. >> >> Could you do readlink on brick path mentioned above >> /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e >> >> from a geaccount user and see if you are getting "Permission Denied" >> errors? >> >> >> >> Thanks, >> >> Kotresh HR >> >> >> >> On Mon, Sep 24, 2018 at 7:35 PM Kotte, Christian (Ext) < >> christian.kotte at novartis.com> wrote: >> >> Ok. It happens on all slave nodes (and on the interimmaster as well). >> >> >> >> It?s like I assumed. These are the logs of one of the slaves: >> >> >> >> gsyncd.log >> >> [2018-09-24 13:52:25.418382] I [repce(slave >> slave/bricks/brick1/brick):80:service_loop] RepceServer: terminating on >> reaching EOF. >> >> [2018-09-24 13:52:37.95297] W [gsyncd(slave >> slave/bricks/brick1/brick):293:main] <top>: Session config file not exists, >> using the default config >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_glustervol1/gsyncd.conf >> >> [2018-09-24 13:52:37.109643] I [resource(slave >> slave/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume >> locally... 
>> >> [2018-09-24 13:52:38.303920] I [resource(slave >> slave/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster >> volume duration=1.1941 >> >> [2018-09-24 13:52:38.304771] I [resource(slave >> slave/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening >> >> [2018-09-24 13:52:41.981554] I [resource(slave >> slave/bricks/brick1/brick):598:entry_ops] <top>: Special case: rename on >> mkdir gfid=29d1d60d-1ad6-45fc-87e0-93d478f7331e >> entry='.gfid/6b97b987-8aef-46c3-af27-20d3aa883016/New folder' >> >> [2018-09-24 13:52:42.45641] E [repce(slave >> slave/bricks/brick1/brick):105:worker] <top>: call failed: >> >> Traceback (most recent call last): >> >> File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 101, in >> worker >> >> res = getattr(self.obj, rmeth)(*in_data[2:]) >> >> File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 599, >> in entry_ops >> >> src_entry = get_slv_dir_path(slv_host, slv_volume, gfid) >> >> File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line >> 682, in get_slv_dir_path >> >> [ENOENT], [ESTALE]) >> >> File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line >> 540, in errno_wrap >> >> return call(*arg) >> >> OSError: [Errno 13] Permission denied: >> '/bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e' >> >> [2018-09-24 13:52:42.81794] I [repce(slave >> slave/bricks/brick1/brick):80:service_loop] RepceServer: terminating on >> reaching EOF. >> >> [2018-09-24 13:52:53.459676] W [gsyncd(slave >> slave/bricks/brick1/brick):293:main] <top>: Session config file not exists, >> using the default config >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_glustervol1/gsyncd.conf >> >> [2018-09-24 13:52:53.473500] I [resource(slave >> slave/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume >> locally... >> >> [2018-09-24 13:52:54.659044] I [resource(slave >> slave/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume >> duration=1.1854 >> >> [2018-09-24 13:52:54.659837] I [resource(slave >> slave/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening >> >> >> >> The folder ?New folder? will be created via Samba and it was renamed by >> my colleague right away after creation. >> >> [root at slave glustervol1_slave_glustervol1]# ls >> /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e/ >> >> [root at slave glustervol1_slave_glustervol1]# ls >> /bricks/brick1/brick/.glusterfs/29/d1/ -al >> >> total 0 >> >> drwx--S---+ 2 root AD+group 50 Sep 21 09:39 . >> >> drwx--S---+ 11 root AD+group 96 Sep 21 09:39 .. >> >> lrwxrwxrwx. 1 root AD+group 75 Sep 21 09:39 >> 29d1d60d-1ad6-45fc-87e0-93d478f7331e -> >> ../../6b/97/6b97b987-8aef-46c3-af27-20d3aa883016/vRealize Operation Manager >> >> >> >> Creating the folder in >> /bricks/brick1/brick/.glusterfs/6b/97/6b97b987-8aef-46c3-af27-20d3aa883016/, >> but it didn?t change anything. 
>> >> >> >> mnt-slave-bricks-brick1-brick.log >> >> [2018-09-24 13:51:10.625723] W [rpc-clnt.c:1753:rpc_clnt_submit] >> 0-glustervol1-client-0: error returned while attempting to connect to >> host:(null), port:0 >> >> [2018-09-24 13:51:10.626092] W [rpc-clnt.c:1753:rpc_clnt_submit] >> 0-glustervol1-client-0: error returned while attempting to connect to >> host:(null), port:0 >> >> [2018-09-24 13:51:10.626181] I [rpc-clnt.c:2105:rpc_clnt_reconfig] >> 0-glustervol1-client-0: changing port to 49152 (from 0) >> >> [2018-09-24 13:51:10.643111] W [rpc-clnt.c:1753:rpc_clnt_submit] >> 0-glustervol1-client-0: error returned while attempting to connect to >> host:(null), port:0 >> >> [2018-09-24 13:51:10.643489] W [dict.c:923:str_to_data] >> (-->/usr/lib64/glusterfs/4.1.3/xlator/protocol/client.so(+0x4131a) >> [0x7fafb023831a] -->/lib64/libglusterfs.so.0(dict_set_str+0x16) >> [0x7fafbdb83266] -->/lib64/libglusterfs.so.0(str_to_data+0x91) >> [0x7fafbdb7fea1] ) 0-dict: value is NULL [Invalid argument] >> >> [2018-09-24 13:51:10.643507] I [MSGID: 114006] >> [client-handshake.c:1308:client_setvolume] 0-glustervol1-client-0: failed >> to set process-name in handshake msg >> >> [2018-09-24 13:51:10.643541] W [rpc-clnt.c:1753:rpc_clnt_submit] >> 0-glustervol1-client-0: error returned while attempting to connect to >> host:(null), port:0 >> >> [2018-09-24 13:51:10.671460] I [MSGID: 114046] >> [client-handshake.c:1176:client_setvolume_cbk] 0-glustervol1-client-0: >> Connected to glustervol1-client-0, attached to remote volume >> '/bricks/brick1/brick'. >> >> [2018-09-24 13:51:10.672694] I [fuse-bridge.c:4294:fuse_init] >> 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel >> 7.22 >> >> [2018-09-24 13:51:10.672715] I [fuse-bridge.c:4927:fuse_graph_sync] >> 0-fuse: switched to graph 0 >> >> [2018-09-24 13:51:10.673329] I [MSGID: 109005] >> [dht-selfheal.c:2342:dht_selfheal_directory] 0-glustervol1-dht: Directory >> selfheal failed: Unable to form layout for directory / >> >> [2018-09-24 13:51:16.116458] I [fuse-bridge.c:5199:fuse_thread_proc] >> 0-fuse: initating unmount of >> /var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E >> >> [2018-09-24 13:51:16.116595] W [glusterfsd.c:1514:cleanup_and_exit] >> (-->/lib64/libpthread.so.0(+0x7e25) [0x7fafbc9eee25] >> -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55d5dac5dd65] >> -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55d5dac5db8b] ) 0-: >> received signum (15), shutting down >> >> [2018-09-24 13:51:16.116616] I [fuse-bridge.c:5981:fini] 0-fuse: >> Unmounting '/var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E'. >> >> [2018-09-24 13:51:16.116625] I [fuse-bridge.c:5986:fini] 0-fuse: Closing >> fuse connection to '/var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E'. >> >> >> >> Regards, >> >> Christian >> >> >> >> *From: *Kotresh Hiremath Ravishankar <khiremat at redhat.com> >> *Date: *Saturday, 22. September 2018 at 06:52 >> *To: *"Kotte, Christian (Ext)" <christian.kotte at novartis.com> >> *Cc: *Gluster Users <gluster-users at gluster.org> >> *Subject: *Re: [Gluster-users] [geo-rep] Replication faulty - gsyncd.log >> OSError: [Errno 13] Permission denied >> >> >> >> The problem occured on slave side whose error is propagated to master. >> Mostly any traceback with repce involved is related to problem in slave. >> Just check few lines above in the log to find the slave node, the crashed >> worker is connected to and get geo replication logs to further debug. 
>> >> >> >> >> >> >> >> >> >> >> >> On Fri, 21 Sep 2018, 20:10 Kotte, Christian (Ext), < >> christian.kotte at novartis.com> wrote: >> >> Hi, >> >> >> >> Any idea how to troubleshoot this? >> >> >> >> New folders and files were created on the master and the replication went >> faulty. They were created via Samba. >> >> >> >> Version: GlusterFS 4.1.3 >> >> >> >> [root at master]# gluster volume geo-replication status >> >> >> >> MASTER NODE MASTER VOL MASTER >> BRICK SLAVE USER >> SLAVE SLAVE >> NODE STATUS CRAWL STATUS LAST_SYNCED >> >> >> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> master glustervol1 /bricks/brick1/brick geoaccount >> ssh://geoaccount at slave_1::glustervol1 N/A Faulty >> N/A N/A >> >> master glustervol1 /bricks/brick1/brick geoaccount >> ssh://geoaccount at slave_2::glustervol1 N/A Faulty >> N/A N/A >> >> master glustervol1 /bricks/brick1/brick geoaccount >> ssh://geoaccount at interimmaster::glustervol1 N/A Faulty >> N/A N/A >> >> >> >> The following error is repeatedly logged in the gsyncd.logs: >> >> [2018-09-21 14:26:38.611479] I [repce(agent >> /bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching >> EOF. >> >> [2018-09-21 14:26:39.211527] I [monitor(monitor):279:monitor] Monitor: >> worker died in startup phase brick=/bricks/brick1/brick >> >> [2018-09-21 14:26:39.214322] I >> [gsyncdstatus(monitor):244:set_worker_status] GeorepStatus: Worker Status >> Change status=Faulty >> >> [2018-09-21 14:26:49.318953] I [monitor(monitor):158:monitor] Monitor: >> starting gsyncd worker brick=/bricks/brick1/brick slave_node>> slave_3 <http://nrchbs-slp2020.nibr.novartis.net> >> >> [2018-09-21 14:26:49.471532] I [gsyncd(agent >> /bricks/brick1/brick):297:main] <top>: Using session config file >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf >> >> [2018-09-21 14:26:49.473917] I [changelogagent(agent >> /bricks/brick1/brick):72:__init__] ChangelogAgent: Agent listining... >> >> [2018-09-21 14:26:49.491359] I [gsyncd(worker >> /bricks/brick1/brick):297:main] <top>: Using session config file >> path=/var/lib/glusterd/geo-replication/glustervol1_slave_3_glustervol1/gsyncd.conf >> >> [2018-09-21 14:26:49.538049] I [resource(worker >> /bricks/brick1/brick):1377:connect_remote] SSH: Initializing SSH connection >> between master and slave... >> >> [2018-09-21 14:26:53.5017] I [resource(worker >> /bricks/brick1/brick):1424:connect_remote] SSH: SSH connection between >> master and slave established. duration=3.4665 >> >> [2018-09-21 14:26:53.5419] I [resource(worker >> /bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume >> locally... >> >> [2018-09-21 14:26:54.120374] I [resource(worker >> /bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume >> duration=1.1146 >> >> [2018-09-21 14:26:54.121012] I [subcmds(worker >> /bricks/brick1/brick):70:subcmd_worker] <top>: Worker spawn successful. 
>> [2018-09-21 14:26:56.144460] I [master(worker /bricks/brick1/brick):1593:register] _GMaster: Working dir path=/var/lib/misc/gluster/gsyncd/glustervol1_slave_3_glustervol1/bricks-brick1-brick
>> [2018-09-21 14:26:56.145145] I [resource(worker /bricks/brick1/brick):1282:service_loop] GLUSTER: Register time time=1537540016
>> [2018-09-21 14:26:56.160064] I [gsyncdstatus(worker /bricks/brick1/brick):277:set_active] GeorepStatus: Worker Status Change status=Active
>> [2018-09-21 14:26:56.161175] I [gsyncdstatus(worker /bricks/brick1/brick):249:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl
>> [2018-09-21 14:26:56.161536] I [master(worker /bricks/brick1/brick):1507:crawl] _GMaster: starting history crawl turns=1 stime=(1537522637, 0) entry_stime=(1537537141, 0) etime=1537540016
>> [2018-09-21 14:26:56.164277] I [master(worker /bricks/brick1/brick):1536:crawl] _GMaster: slave's time stime=(1537522637, 0)
>> [2018-09-21 14:26:56.197065] I [master(worker /bricks/brick1/brick):1360:process] _GMaster: Skipping already processed entry ops to_changelog=1537522638 num_changelogs=1 from_changelog=1537522638
>> [2018-09-21 14:26:56.197402] I [master(worker /bricks/brick1/brick):1374:process] _GMaster: Entry Time Taken MKD=0 MKN=0 LIN=0 SYM=0 REN=0 RMD=0 CRE=0 duration=0.0000 UNL=1
>> [2018-09-21 14:26:56.197623] I [master(worker /bricks/brick1/brick):1384:process] _GMaster: Data/Metadata Time Taken SETA=0 SETX=0 meta_duration=0.0000 data_duration=0.0284 DATA=0 XATT=0
>> [2018-09-21 14:26:56.198230] I [master(worker /bricks/brick1/brick):1394:process] _GMaster: Batch Completed changelog_end=1537522638 entry_stime=(1537537141, 0) changelog_start=1537522638 stime=(1537522637, 0) duration=0.0333 num_changelogs=1 mode=history_changelog
>> [2018-09-21 14:26:57.200436] I [master(worker /bricks/brick1/brick):1536:crawl] _GMaster: slave's time stime=(1537522637, 0)
>> [2018-09-21 14:26:57.528625] E [repce(worker /bricks/brick1/brick):197:__call__] RepceClient: call failed call=17209:140650361157440:1537540017.21 method=entry_ops error=OSError
>> [2018-09-21 14:26:57.529371] E [syncdutils(worker /bricks/brick1/brick):332:log_raise_exception] <top>: FAIL:
>>
>> Traceback (most recent call last):
>>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in main
>>     func(args)
>>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in subcmd_worker
>>     local.service_loop(remote)
>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1288, in service_loop
>>     g3.crawlwrap(oneshot=True)
>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in crawlwrap
>>     self.crawl()
>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1545, in crawl
>>     self.changelogs_batch_process(changes)
>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1445, in changelogs_batch_process
>>     self.process(batch)
>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1280, in process
>>     self.process_change(change, done, retry)
>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1179, in process_change
>>     failures = self.slave.server.entry_ops(entries)
>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in __call__
>>     return self.ins(self.meth, *a)
>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in __call__
>>     raise res
>> OSError: [Errno 13] Permission denied: '/bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e'
>>
>> The permissions look fine. The replication is done via the geo user instead of root. It should be able to read the path, but I'm not sure whether the syncdaemon runs under geoaccount!?
>>
>> [root at master vRealize Operation Manager]# ll /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e
>> lrwxrwxrwx. 1 root root 75 Sep 21 09:39 /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e -> ../../6b/97/6b97b987-8aef-46c3-af27-20d3aa883016/vRealize Operation Manager
>>
>> [root at master vRealize Operation Manager]# ll /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e/
>> total 4
>> drwxrwxr-x. 2 AD+user AD+group  131 Sep 21 10:14 6.7
>> drwxrwxr-x. 2 AD+user AD+group 4096 Sep 21 09:43 7.0
>> drwxrwxr-x. 2 AD+user AD+group   57 Sep 21 10:28 7.5
>> [root at master vRealize Operation Manager]#
>>
>> It is possible that the folder was renamed. I have had three similar issues since I migrated to GlusterFS 4.x but couldn't investigate much. I needed to completely wipe GlusterFS and the geo-replication to get rid of this error...
>>
>> Any help is appreciated.
>>
>> Regards,
>>
>> *Christian Kotte*
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>> --
>> Thanks and Regards,
>> Kotresh H R
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

--
Thanks and Regards,
Kotresh H R
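
A quick way to test the "it should be able to read" assumption from the thread: the sketch below is a rough probe (not part of gsyncd) that forks, drops to the geo account, and lstat()s every component of the gfid path from the OSError, since a missing execute bit on any parent directory under the brick or .glusterfs is enough to surface as "[Errno 13] Permission denied" on the leaf. The account name and path are the ones quoted above; run it as root on the brick host that actually raised the error (per the earlier advice, the slave).

# check_geo_access.py - rough permission probe, not part of gsyncd.
# Run as root on the brick host that raised the OSError. It forks, switches
# to the geo-rep account, and lstat()s every path component, because EACCES
# on the leaf usually means a parent directory is missing the execute bit
# for that account.
import os
import pwd
import sys

ACCOUNT = "geoaccount"  # non-root geo-rep user from the thread
PATH = ("/bricks/brick1/brick/.glusterfs/29/d1/"
        "29d1d60d-1ad6-45fc-87e0-93d478f7331e")

pw = pwd.getpwnam(ACCOUNT)
pid = os.fork()
if pid == 0:
    # Child: drop privileges the way a process of ACCOUNT would have them
    # (primary group, then supplementary groups, then uid; the order matters).
    os.setgid(pw.pw_gid)
    os.initgroups(ACCOUNT, pw.pw_gid)
    os.setuid(pw.pw_uid)
    path_so_far = "/"
    for part in PATH.strip("/").split("/"):
        path_so_far = os.path.join(path_so_far, part)
        try:
            os.lstat(path_so_far)
        except OSError as err:
            print("FAIL as %s at %s: %s" % (ACCOUNT, path_so_far, err))
            os._exit(1)
    print("lstat OK as %s for every component of %s" % (ACCOUNT, PATH))
    os._exit(0)

_, status = os.waitpid(pid, 0)
sys.exit(os.WEXITSTATUS(status))

If it fails on a parent directory rather than on the gfid symlink itself, that points to a traverse problem (execute bit or ACL) for the geo account rather than to the entry gsyncd was trying to sync.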