Kotresh Hiremath Ravishankar
2019-Nov-19 05:51 UTC
[Gluster-users] Geo_replication to Faulty
Which version of gluster are you using?

On Tue, Nov 19, 2019 at 11:00 AM deepu srinivasan <sdeepugd at gmail.com> wrote:

> Hi kotresh
> Is there a stable release in 6.x series?
>
> On Tue, Nov 19, 2019, 10:44 AM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>
>> This issue has been recently fixed with the following patch and should be
>> available in latest gluster-6.x
>>
>> https://review.gluster.org/#/c/glusterfs/+/23570/
>>
>> On Tue, Nov 19, 2019 at 10:26 AM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>
>>> Hi Aravinda
>>> *The below logs are from master end:*
>>>
>>> [2019-11-16 17:29:43.536881] I [gsyncdstatus(worker /home/sas/gluster/data/code-misc6):281:set_active] GeorepStatus: Worker Status Change status=Active
>>> [2019-11-16 17:29:43.629620] I [gsyncdstatus(worker /home/sas/gluster/data/code-misc6):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl
>>> [2019-11-16 17:29:43.630328] I [master(worker /home/sas/gluster/data/code-misc6):1517:crawl] _GMaster: starting history crawl turns=1 stime=(1573924576, 0) entry_stime=(1573924576, 0) etime=1573925383
>>> [2019-11-16 17:29:44.636725] I [master(worker /home/sas/gluster/data/code-misc6):1546:crawl] _GMaster: slave's time stime=(1573924576, 0)
>>> [2019-11-16 17:29:44.778966] I [master(worker /home/sas/gluster/data/code-misc6):898:fix_possible_entry_failures] _GMaster: Fixing ENOENT error in slave. Parent does not exist on master. Safe to ignore, take out entry retry_count=1 entry=({'uid': 0, 'gfid': 'c02519e0-0ead-4fe8-902b-dcae72ef83a3', 'gid': 0, 'mode': 33188, 'entry': '.gfid/d60aa0d5-4fdf-4721-97dc-9e3e50995dab/368307802', 'op': 'CREATE'}, 2, {'slave_isdir': False, 'gfid_mismatch': False, 'slave_name': None, 'slave_gfid': None, 'name_mismatch': False, 'dst': False})
>>> [2019-11-16 17:29:44.779306] I [master(worker /home/sas/gluster/data/code-misc6):942:handle_entry_failures] _GMaster: Sucessfully fixed entry ops with gfid mismatch retry_count=1
>>> [2019-11-16 17:29:44.779516] I [master(worker /home/sas/gluster/data/code-misc6):1194:process_change] _GMaster: Retry original entries. count = 1
>>> [2019-11-16 17:29:44.879321] E [repce(worker /home/sas/gluster/data/code-misc6):214:__call__] RepceClient: call failed call=151945:140353273153344:1573925384.78 method=entry_ops error=OSError
>>> [2019-11-16 17:29:44.879750] E [syncdutils(worker /home/sas/gluster/data/code-misc6):338:log_raise_exception] <top>: FAIL:
>>> Traceback (most recent call last):
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 322, in main
>>>     func(args)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82, in subcmd_worker
>>>     local.service_loop(remote)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1277, in service_loop
>>>     g3.crawlwrap(oneshot=True)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 599, in crawlwrap
>>>     self.crawl()
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1555, in crawl
>>>     self.changelogs_batch_process(changes)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1455, in changelogs_batch_process
>>>     self.process(batch)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1290, in process
>>>     self.process_change(change, done, retry)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1195, in process_change
>>>     failures = self.slave.server.entry_ops(entries)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233, in __call__
>>>     return self.ins(self.meth, *a)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215, in __call__
>>>     raise res
>>> OSError: [Errno 13] Permission denied: '/home/sas/gluster/data/code-misc6/.glusterfs/6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb'
>>> [2019-11-16 17:29:44.911767] I [repce(agent /home/sas/gluster/data/code-misc6):97:service_loop] RepceServer: terminating on reaching EOF.
>>> [2019-11-16 17:29:45.509344] I [monitor(monitor):278:monitor] Monitor: worker died in startup phase brick=/home/sas/gluster/data/code-misc6
>>> [2019-11-16 17:29:45.511806] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
>>>
>>> *The below logs are from the slave end.*
>>>
>>> [2019-11-16 17:24:42.281599] I [resource(slave 192.168.185.106/home/sas/gluster/data/code-misc6):580:entry_ops] <top>: Special case: rename on mkdir gfid=6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb entry='.gfid/a8921d78-a078-46d3-aca5-8b078eb62cac/8878061b-d5b3-47a6-b01c-8310fee39b20'
>>> [2019-11-16 17:24:42.370582] E [repce(slave 192.168.185.106/home/sas/gluster/data/code-misc6):122:worker] <top>: call failed:
>>> Traceback (most recent call last):
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
>>>     res = getattr(self.obj, rmeth)(*in_data[2:])
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 581, in entry_ops
>>>     src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 690, in get_slv_dir_path
>>>     [ENOENT], [ESTALE])
>>>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 546, in errno_wrap
>>>     return call(*arg)
>>> OSError: [Errno 13] Permission denied: '/home/sas/gluster/data/code-misc6/.glusterfs/6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb'
>>> [2019-11-16 17:24:42.400402] I [repce(slave 192.168.185.106/home/sas/gluster/data/code-misc6):97:service_loop] RepceServer: terminating on reaching EOF.
>>> [2019-11-16 17:24:53.403165] W [gsyncd(slave 192.168.185.106/home/sas/gluster/data/code-misc6):304:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/code-misc_192.168.185.107_code-misc/gsyncd.con
>>>
>>> On Sat, Nov 16, 2019, 9:26 PM Aravinda Vishwanathapura Krishna Murthy <avishwan at redhat.com> wrote:
>>>
>>>> Hi Deepu,
>>>>
>>>> Please share the reason for Faulty from Geo-rep logs of respective
>>>> master node.
>>>>
>>>> On Sat, Nov 16, 2019 at 1:01 AM deepu srinivasan <sdeepugd at gmail.com> wrote:
>>>>
>>>>> Hi Users/Development Team
>>>>> We have set up a Geo-replication session with non-root in slave setup
>>>>> in our DC.
>>>>> It was working well with Active Status and Changelogcrawl.
>>>>>
>>>>> We were mounting the master node and the file is being written in it.
>>>>> We were running some process as the root user so the process wrote
>>>>> some file and folder with root permission.
>>>>> After stopping the geo-replication and starting the process the
>>>>> session went to the faulty state.
>>>>> How to recover?
>>>>
>>>> --
>>>> regards
>>>> Aravinda VK

--
Thanks and Regards,
Kotresh H R
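
For anyone triaging a similar Faulty session from logs pasted like the ones in this thread, here is a minimal Python sketch that pulls out the error-level entries and the paths named in "Permission denied" lines. It is not part of gluster. The function names (`summarize`, `gfid_backend_relpath`) are hypothetical, the regular expression is only inferred from the log lines quoted in this thread, and the `.glusterfs/<2 hex>/<2 hex>/<gfid>` layout is taken from the paths shown in the tracebacks.

```python
import re

# Header of a gsyncd log entry, inferred from the lines quoted in this thread:
# [timestamp] LEVEL [module(context):line:function] Logger: message
ENTRY_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<module>\w+)\((?P<context>[^)]*)\):(?P<line>\d+):(?P<func>\w+)\] "
    r"(?P<logger>[^:]+): ?(?P<msg>.*)$"
)
DENIED_RE = re.compile(r"OSError: \[Errno 13\] Permission denied: '([^']+)'")


def summarize(log_text):
    """Return (error_entries, denied_paths) found in gsyncd log text.

    error_entries is a list of (timestamp, function, message) tuples for
    entries logged at level E; denied_paths lists each distinct path that
    appears in an Errno 13 line, in order of first appearance.
    """
    errors, denied = [], []
    for raw in log_text.splitlines():
        # Drop mail quote markers ('>') and surrounding whitespace, so text
        # copied straight from a quoted reply can be parsed as well.
        line = raw.lstrip("> ").strip()
        entry = ENTRY_RE.match(line)
        if entry and entry.group("level") == "E":
            errors.append((entry.group("ts"), entry.group("func"), entry.group("msg")))
        hit = DENIED_RE.search(line)
        if hit and hit.group(1) not in denied:
            denied.append(hit.group(1))
    return errors, denied


def gfid_backend_relpath(gfid):
    """Map a gfid to its path relative to the brick root, following the
    layout visible in the tracebacks above (assumed, not queried from gluster)."""
    return ".glusterfs/%s/%s/%s" % (gfid[0:2], gfid[2:4], gfid)
```

Running `summarize()` over the master log above should report the two E entries (`__call__` and `log_raise_exception`) and the single denied path. That path ends with `gfid_backend_relpath("6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb")`, the same gfid the slave log reports in its "rename on mkdir" entry just before failing.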