Kotresh Hiremath Ravishankar
2019-Nov-19 05:13 UTC
[Gluster-users] Geo_replication to Faulty
This issue has recently been fixed with the following patch, and the fix should be available in the latest gluster-6.x releases:
https://review.gluster.org/#/c/glusterfs/+/23570/
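For context on why this turns into a Faulty session: gsyncd wraps its entry operations in an errno whitelist, tolerating or retrying the errnos it knows are harmless during a crawl and re-raising everything else. A minimal sketch of that pattern (simplified and illustrative, not the exact gsyncd code):

    import errno

    def errno_wrap(call, args=(), tolerated=(), retry_on=(), max_retries=5):
        # Simplified sketch of the errno whitelisting visible in the
        # tracebacks quoted below; not the exact gsyncd implementation.
        for attempt in range(max_retries):
            try:
                return call(*args)
            except OSError as e:
                if e.errno in tolerated:
                    return                # e.g. ENOENT: entry gone, ignore
                if e.errno in retry_on and attempt < max_retries - 1:
                    continue              # e.g. ESTALE: transient, retry
                raise                     # EACCES ([Errno 13]) lands here

In the slave traceback below, get_slv_dir_path whitelists only [ENOENT] and [ESTALE], so the EACCES raised on the root-owned .glusterfs entry propagates, the worker dies, and the monitor marks the brick Faulty.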
On Tue, Nov 19, 2019 at 10:26 AM deepu srinivasan <sdeepugd at gmail.com> wrote:
>
> Hi Aravinda
>
> *The logs below are from the master end:*
>
> [2019-11-16 17:29:43.536881] I [gsyncdstatus(worker /home/sas/gluster/data/code-misc6):281:set_active] GeorepStatus: Worker Status Change status=Active
> [2019-11-16 17:29:43.629620] I [gsyncdstatus(worker /home/sas/gluster/data/code-misc6):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl
> [2019-11-16 17:29:43.630328] I [master(worker /home/sas/gluster/data/code-misc6):1517:crawl] _GMaster: starting history crawl turns=1 stime=(1573924576, 0) entry_stime=(1573924576, 0) etime=1573925383
> [2019-11-16 17:29:44.636725] I [master(worker /home/sas/gluster/data/code-misc6):1546:crawl] _GMaster: slave's time stime=(1573924576, 0)
> [2019-11-16 17:29:44.778966] I [master(worker /home/sas/gluster/data/code-misc6):898:fix_possible_entry_failures] _GMaster: Fixing ENOENT error in slave. Parent does not exist on master. Safe to ignore, take out entry retry_count=1 entry=({'uid': 0, 'gfid': 'c02519e0-0ead-4fe8-902b-dcae72ef83a3', 'gid': 0, 'mode': 33188, 'entry': '.gfid/d60aa0d5-4fdf-4721-97dc-9e3e50995dab/368307802', 'op': 'CREATE'}, 2, {'slave_isdir': False, 'gfid_mismatch': False, 'slave_name': None, 'slave_gfid': None, 'name_mismatch': False, 'dst': False})
> [2019-11-16 17:29:44.779306] I [master(worker /home/sas/gluster/data/code-misc6):942:handle_entry_failures] _GMaster: Sucessfully fixed entry ops with gfid mismatch retry_count=1
> [2019-11-16 17:29:44.779516] I [master(worker /home/sas/gluster/data/code-misc6):1194:process_change] _GMaster: Retry original entries. count = 1
> [2019-11-16 17:29:44.879321] E [repce(worker /home/sas/gluster/data/code-misc6):214:__call__] RepceClient: call failed call=151945:140353273153344:1573925384.78 method=entry_ops error=OSError
> [2019-11-16 17:29:44.879750] E [syncdutils(worker /home/sas/gluster/data/code-misc6):338:log_raise_exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 322, in main
>     func(args)
>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82, in subcmd_worker
>     local.service_loop(remote)
>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1277, in service_loop
>     g3.crawlwrap(oneshot=True)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 599, in crawlwrap
>     self.crawl()
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1555, in crawl
>     self.changelogs_batch_process(changes)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1455, in changelogs_batch_process
>     self.process(batch)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1290, in process
>     self.process_change(change, done, retry)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1195, in process_change
>     failures = self.slave.server.entry_ops(entries)
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233, in __call__
>     return self.ins(self.meth, *a)
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215, in __call__
>     raise res
> OSError: [Errno 13] Permission denied: '/home/sas/gluster/data/code-misc6/.glusterfs/6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb'
> [2019-11-16 17:29:44.911767] I [repce(agent /home/sas/gluster/data/code-misc6):97:service_loop] RepceServer: terminating on reaching EOF.
> [2019-11-16 17:29:45.509344] I [monitor(monitor):278:monitor] Monitor: worker died in startup phase brick=/home/sas/gluster/data/code-misc6
> [2019-11-16 17:29:45.511806] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
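The path in the traceback is the .glusterfs backend entry for gfid 6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb on the slave brick. A quick way to confirm the ownership problem is to stat that entry as the non-root geo-rep user (a sketch; the path is copied from the traceback, adjust for your brick):

    import os
    import stat

    # Path copied from the traceback above.
    path = ('/home/sas/gluster/data/code-misc6/.glusterfs/'
            '6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb')

    try:
        st = os.lstat(path)  # for a directory gfid this entry is a symlink
        print('uid=%d gid=%d mode=%s symlink=%s' % (
            st.st_uid, st.st_gid, oct(stat.S_IMODE(st.st_mode)),
            stat.S_ISLNK(st.st_mode)))
    except OSError as e:
        print('lstat failed:', e)  # EACCES reproduces what the worker hit

An entry owned by uid 0, or an unreadable parent component, matches the "processes running as root wrote to the mount" history described at the bottom of this thread.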
>
> *The logs below are from the slave end:*
>
> [2019-11-16 17:24:42.281599] I [resource(slave 192.168.185.106/home/sas/gluster/data/code-misc6):580:entry_ops] <top>: Special case: rename on mkdir gfid=6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb entry='.gfid/a8921d78-a078-46d3-aca5-8b078eb62cac/8878061b-d5b3-47a6-b01c-8310fee39b20'
> [2019-11-16 17:24:42.370582] E [repce(slave 192.168.185.106/home/sas/gluster/data/code-misc6):122:worker] <top>: call failed:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
>     res = getattr(self.obj, rmeth)(*in_data[2:])
>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 581, in entry_ops
>     src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 690, in get_slv_dir_path
>     [ENOENT], [ESTALE])
>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap
>     return call(*arg)
> OSError: [Errno 13] Permission denied: '/home/sas/gluster/data/code-misc6/.glusterfs/6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb'
> [2019-11-16 17:24:42.400402] I [repce(slave 192.168.185.106/home/sas/gluster/data/code-misc6):97:service_loop] RepceServer: terminating on reaching EOF.
> [2019-11-16 17:24:53.403165] W [gsyncd(slave 192.168.185.106/home/sas/gluster/data/code-misc6):304:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/code-misc_192.168.185.107_code-misc/gsyncd.con
>
> On Sat, Nov 16, 2019, 9:26 PM Aravinda Vishwanathapura Krishna Murthy <avishwan at redhat.com> wrote:
>
>> Hi Deepu,
>>
>> Please share the reason for the Faulty state from the geo-rep logs of the respective master node.
>>
>> On Sat, Nov 16, 2019 at 1:01 AM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>
>>> Hi Users/Development Team,
>>>
>>> We have set up a geo-replication session with a non-root user on the slave in our DC.
>>> It was working well, with Active status and Changelog Crawl.
>>>
>>> We mounted the master volume, and files were being written to it.
>>> Some processes were running as the root user, so they wrote files and folders with root permissions.
>>> After stopping and restarting the geo-replication session, it went to the Faulty state.
>>> How do we recover?
>>
>> --
>> regards
>> Aravinda VK

--
Thanks and Regards,
Kotresh H R
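For anyone hitting the same fault before moving to a patched build, the scope of the problem can be estimated by listing root-owned entries on the slave brick. A minimal sketch, assuming the brick path and non-root user ('sas') from this thread; internal .glusterfs housekeeping is normally root-owned, so it is pruned, and every data file shares ownership with its .glusterfs hard link anyway:

    import os

    # Assumption from this thread: the slave brick path below.
    BRICK = '/home/sas/gluster/data/code-misc6'

    for root, dirs, files in os.walk(BRICK):
        if root == BRICK and '.glusterfs' in dirs:
            dirs.remove('.glusterfs')      # prune the backend tree
        for name in dirs + files:
            path = os.path.join(root, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue                   # unreadable: also suspicious
            if st.st_uid == 0:             # written by a root process
                print(path)

Once ownership on the slave side is corrected (or after upgrading to a release carrying the patch above), the session can be restarted, e.g. gluster volume geo-replication code-misc sas@192.168.185.107::code-misc stop, then start.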