Kotresh Hiremath Ravishankar
2018-Jul-18 03:58 UTC
[Gluster-users] Upgrade to 4.1.1 geo-replication does not work
Hi Marcus,

I am testing out 4.1 myself and I will have an update today.

For this particular traceback, gsyncd is not able to find the library. Is it the rpm install? If so, the gluster libraries would be in /usr/lib. Please run the commands below (see also the sketch at the end of this message):

#ldconfig /usr/lib
#ldconfig -p /usr/lib | grep libgf   (this should list libgfchangelog.so)

Geo-rep should then be fixed automatically.

Thanks,
Kotresh HR

On Wed, Jul 18, 2018 at 1:27 AM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:

> Hi again,
>
> I continue to do some testing, but now I have come to a stage where I need help.
>
> gsyncd.log was complaining that /usr/local/sbin/gluster was missing, so I made a link.
> After that /usr/local/sbin/glusterfs was missing, so I made a link there as well.
> Both links were made on all slave nodes.
>
> Now I have a new error that I cannot resolve myself:
> it cannot open libgfchangelog.so.
>
> Many thanks!
>
> Regards
> Marcus Pedersén
>
> Part of gsyncd.log:
>
> OSError: libgfchangelog.so: cannot open shared object file: No such file or directory
> [2018-07-17 19:32:06.517106] I [repce(agent /urd-gds/gluster):89:service_loop] RepceServer: terminating on reaching EOF.
> [2018-07-17 19:32:07.479553] I [monitor(monitor):272:monitor] Monitor: worker died in startup phase brick=/urd-gds/gluster
> [2018-07-17 19:32:17.500709] I [monitor(monitor):158:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=urd-gds-geo-000
> [2018-07-17 19:32:17.541547] I [gsyncd(agent /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-07-17 19:32:17.541959] I [gsyncd(worker /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-07-17 19:32:17.542363] I [changelogagent(agent /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
> [2018-07-17 19:32:17.550894] I [resource(worker /urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection between master and slave...
> [2018-07-17 19:32:19.166246] I [resource(worker /urd-gds/gluster):1395:connect_remote] SSH: SSH connection between master and slave established. duration=1.6151
> [2018-07-17 19:32:19.166806] I [resource(worker /urd-gds/gluster):1067:connect] GLUSTER: Mounting gluster volume locally...
> [2018-07-17 19:32:20.257344] I [resource(worker /urd-gds/gluster):1090:connect] GLUSTER: Mounted gluster volume duration=1.0901
> [2018-07-17 19:32:20.257921] I [subcmds(worker /urd-gds/gluster):70:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
> [2018-07-17 19:32:20.274647] E [repce(agent /urd-gds/gluster):114:worker] <top>: call failed:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 110, in worker
>     res = getattr(self.obj, rmeth)(*in_data[2:])
>   File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 37, in init
>     return Changes.cl_init()
>   File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 21, in __getattr__
>     from libgfchangelog import Changes as LChanges
>   File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 17, in <module>
>     class Changes(object):
>   File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 19, in Changes
>     use_errno=True)
>   File "/usr/lib64/python2.7/ctypes/__init__.py", line 360, in __init__
>     self._handle = _dlopen(self._name, mode)
> OSError: libgfchangelog.so: cannot open shared object file: No such file or directory
> [2018-07-17 19:32:20.275093] E [repce(worker /urd-gds/gluster):206:__call__] RepceClient: call failed call=6078:139982918485824:1531855940.27 method=init error=OSError
> [2018-07-17 19:32:20.275192] E [syncdutils(worker /urd-gds/gluster):330:log_raise_exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in main
>     func(args)
>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in subcmd_worker
>     local.service_loop(remote)
>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1236, in service_loop
>     changelog_agent.init()
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 225, in __call__
>     return self.ins(self.meth, *a)
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 207, in __call__
>     raise res
> OSError: libgfchangelog.so: cannot open shared object file: No such file or directory
> [2018-07-17 19:32:20.286787] I [repce(agent /urd-gds/gluster):89:service_loop] RepceServer: terminating on reaching EOF.
> [2018-07-17 19:32:21.259891] I [monitor(monitor):272:monitor] Monitor: worker died in startup phase brick=/urd-gds/gluster
>
> ------------------------------
> From: gluster-users-bounces at gluster.org <gluster-users-bounces at gluster.org> on behalf of Marcus Pedersén <marcus.pedersen at slu.se>
> Sent: 16 July 2018 21:59
> To: khiremat at redhat.com
> Cc: gluster-users at gluster.org
> Subject: Re: [Gluster-users] Upgrade to 4.1.1 geo-replication does not work
>
> Hi Kotresh,
>
> I have been testing for a bit and, as you can see from the logs I sent before, permission is denied for geouser on the slave node on the file:
> /var/log/glusterfs/cli.log
>
> I have turned selinux off and, just for testing, I changed permissions on /var/log/glusterfs/cli.log so geouser can access it.
> Starting geo-replication after that gives the response successful, but all nodes get status Faulty.
> If I run: gluster-mountbroker status
>
> I get:
>
> +-----------------------------+-------------+---------------------------+--------------+-------------------------+
> |            NODE             | NODE STATUS |         MOUNT ROOT        |    GROUP     |          USERS          |
> +-----------------------------+-------------+---------------------------+--------------+-------------------------+
> | urd-gds-geo-001.hgen.slu.se |     UP      | /var/mountbroker-root(OK) | geogroup(OK) | geouser(urd-gds-volume) |
> | urd-gds-geo-002             |     UP      | /var/mountbroker-root(OK) | geogroup(OK) | geouser(urd-gds-volume) |
> | localhost                   |     UP      | /var/mountbroker-root(OK) | geogroup(OK) | geouser(urd-gds-volume) |
> +-----------------------------+-------------+---------------------------+--------------+-------------------------+
>
> and that is all nodes on the slave cluster, so mountbroker seems ok.
>
> gsyncd.log logs an error about /usr/local/sbin/gluster being missing.
> That is correct because gluster is in /sbin/gluster and /usr/sbin/gluster.
>
> Another error is that SSH between master and slave is broken,
> but now that I have changed permissions on /var/log/glusterfs/cli.log I can run:
>
> ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 geouser at urd-gds-geo-001 gluster --xml --remote-host=localhost volume info urd-gds-volume
>
> as geouser and that works, which means that the ssh connection works.
>
> Are the permissions on /var/log/glusterfs/cli.log changed when geo-replication is set up?
> Is gluster supposed to be in /usr/local/sbin/gluster?
>
> Do I have any options or should I remove the current geo-replication and create a new one?
> How much do I need to clean up before creating a new geo-replication?
> In that case, can I pause geo-replication, mount the slave cluster on the master cluster and run rsync, just to speed up the transfer of files?
>
> Many thanks in advance!
>
> Marcus Pedersén
>
> Part from the gsyncd.log:
>
> [2018-07-16 19:34:56.26287] E [syncdutils(worker /urd-gds/gluster):749:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-WrbZ22/bf60c68f1a195dad59573a8dbaa309f2.sock geouser at urd-gds-geo-001 /nonexistent/gsyncd slave urd-gds-volume geouser at urd-gds-geo-001::urd-gds-volume --master-node urd-gds-001 --master-node-id 912bebfd-1a7f-44dc-b0b7-f001a20d58cd --master-brick /urd-gds/gluster --local-node urd-gds-geo-000 --local-node-id 03075698-2bbf-43e4-a99a-65fe82f61794 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/local/sbin/ error=1
> [2018-07-16 19:34:56.26583] E [syncdutils(worker /urd-gds/gluster):753:logerr] Popen: ssh> failure: execution of "/usr/local/sbin/gluster" failed with ENOENT (No such file or directory)
> [2018-07-16 19:34:56.33901] I [repce(agent /urd-gds/gluster):89:service_loop] RepceServer: terminating on reaching EOF.
> [2018-07-16 19:34:56.34307] I [monitor(monitor):262:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
> [2018-07-16 19:35:06.59412] I [monitor(monitor):158:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=urd-gds-geo-000
> [2018-07-16 19:35:06.99509] I [gsyncd(worker /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-07-16 19:35:06.99561] I [gsyncd(agent /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-07-16 19:35:06.100481] I [changelogagent(agent /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
> [2018-07-16 19:35:06.108834] I [resource(worker /urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection between master and slave...
> [2018-07-16 19:35:06.762320] E [syncdutils(worker /urd-gds/gluster):303:log_raise_exception] <top>: connection to peer is broken
> [2018-07-16 19:35:06.763103] E [syncdutils(worker /urd-gds/gluster):749:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-K9mB6Q/bf60c68f1a195dad59573a8dbaa309f2.sock geouser at urd-gds-geo-001 /nonexistent/gsyncd slave urd-gds-volume geouser at urd-gds-geo-001::urd-gds-volume --master-node urd-gds-001 --master-node-id 912bebfd-1a7f-44dc-b0b7-f001a20d58cd --master-brick /urd-gds/gluster --local-node urd-gds-geo-000 --local-node-id 03075698-2bbf-43e4-a99a-65fe82f61794 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/local/sbin/ error=1
> [2018-07-16 19:35:06.763398] E [syncdutils(worker /urd-gds/gluster):753:logerr] Popen: ssh> failure: execution of "/usr/local/sbin/gluster" failed with ENOENT (No such file or directory)
> [2018-07-16 19:35:06.771905] I [repce(agent /urd-gds/gluster):89:service_loop] RepceServer: terminating on reaching EOF.
> [2018-07-16 19:35:06.772272] I [monitor(monitor):262:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
> [2018-07-16 19:35:16.786387] I [monitor(monitor):158:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=urd-gds-geo-000
> [2018-07-16 19:35:16.828056] I [gsyncd(worker /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-07-16 19:35:16.828066] I [gsyncd(agent /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-07-16 19:35:16.828912] I [changelogagent(agent /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
> [2018-07-16 19:35:16.837100] I [resource(worker /urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection between master and slave...
> [2018-07-16 19:35:17.260257] E [syncdutils(worker /urd-gds/gluster):303:log_raise_exception] <top>: connection to peer is broken
>
> ------------------------------
> From: gluster-users-bounces at gluster.org <gluster-users-bounces at gluster.org> on behalf of Marcus Pedersén <marcus.pedersen at slu.se>
> Sent: 13 July 2018 14:50
> To: Kotresh Hiremath Ravishankar
> Cc: gluster-users at gluster.org
> Subject: Re: [Gluster-users] Upgrade to 4.1.1 geo-replication does not work
>
> Hi Kotresh,
> Yes, all nodes have the same version, 4.1.1, on both master and slave.
> All glusterd are crashing on the master side.
> Will send logs tonight.
>
> Thanks,
> Marcus
>
> ################
> Marcus Pedersén
> Systemadministrator
> Interbull Centre
> ################
> Sent from my phone
> ################
>
> On 13 July 2018 11:28, Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>
> Hi Marcus,
>
> Is the gluster geo-rep version the same on both master and slave?
>
> Thanks,
> Kotresh HR
>
> On Fri, Jul 13, 2018 at 1:26 AM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:
>
> Hi Kotresh,
>
> I have replaced both files (gsyncdconfig.py
> <https://review.gluster.org/#/c/20207/1/geo-replication/syncdaemon/gsyncdconfig.py>
> and repce.py
> <https://review.gluster.org/#/c/20207/1/geo-replication/syncdaemon/repce.py>)
> on all nodes, both master and slave.
>
> I rebooted all servers but the geo-replication status is still Stopped.
> I tried to start geo-replication with the response Successful, but the status still shows Stopped on all nodes.
> Nothing has been written to the geo-replication logs since I sent the tail of the log,
> so I do not know what info to provide.
>
> Please help me to find a way to solve this.
>
> Thanks!
>
> Regards
> Marcus
>
> ------------------------------
> From: gluster-users-bounces at gluster.org <gluster-users-bounces at gluster.org> on behalf of Marcus Pedersén <marcus.pedersen at slu.se>
> Sent: 12 July 2018 08:51
> To: Kotresh Hiremath Ravishankar
> Cc: gluster-users at gluster.org
> Subject: Re: [Gluster-users] Upgrade to 4.1.1 geo-replication does not work
>
> Thanks Kotresh,
> I installed through the official centos channel, centos-release-gluster41.
> Isn't this fix included in the centos install?
> I will have a look, test it tonight and come back to you!
>
> Thanks a lot!
>
> Regards
> Marcus
>
> ################
> Marcus Pedersén
> Systemadministrator
> Interbull Centre
> ################
> Sent from my phone
> ################
>
> On 12 July 2018 07:41, Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
>
> Hi Marcus,
>
> I think the fix [1] is needed in 4.1.
> Could you please try this out and let us know if that works for you?
>
> [1] https://review.gluster.org/#/c/20207/
>
> Thanks,
> Kotresh HR
>
> On Thu, Jul 12, 2018 at 1:49 AM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:
>
> Hi all,
>
> I have upgraded from 3.12.9 to 4.1.1 and have been following the upgrade instructions for an offline upgrade.
> I upgraded the geo-replication side first, 1 x (2+1), and the master side after that, 2 x (2+1).
> Both clusters work the way they should on their own.
>
> After the upgrade, the status for all geo-replication nodes on the master side is Stopped.
> I tried to start the geo-replication from the master node and the response back was started successfully.
> Status again .... Stopped
> Tried to start again and got the response started successfully; after that all glusterd crashed on all master nodes.
> After a restart of all glusterd the master cluster was up again.
> Status for geo-replication is still Stopped, and every attempt to start it after this gives the response successful but the status stays Stopped.
>
> Please help me get the geo-replication up and running again.
>
> Best regards
> Marcus Pedersén
>
> Part of geo-replication log from master node:
>
> [2018-07-11 18:42:48.941760] I [changelogagent(/urd-gds/gluster):73:__init__] ChangelogAgent: Agent listining...
> [2018-07-11 18:42:48.947567] I [resource(/urd-gds/gluster):1780:connect_remote] SSH: Initializing SSH connection between master and slave...
> [2018-07-11 18:42:49.363514] E [syncdutils(/urd-gds/gluster):304:log_raise_exception] <top>: connection to peer is broken
> [2018-07-11 18:42:49.364279] E [resource(/urd-gds/gluster):210:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-hjRhBo/7e5534547f3675a710a107722317484f.sock geouser at urd-gds-geo-000 /nonexistent/gsyncd --session-owner 5e94eb7d-219f-4741-a179-d4ae6b50c7ee --local-id .%2Furd-gds%2Fgluster --local-node urd-gds-001 -N --listen --timeout 120 gluster://localhost:urd-gds-volume error=2
> [2018-07-11 18:42:49.364586] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> usage: gsyncd.py [-h]
> [2018-07-11 18:42:49.364799] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh>
> [2018-07-11 18:42:49.364989] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> {monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
> [2018-07-11 18:42:49.365210] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> ...
> [2018-07-11 18:42:49.365408] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice: '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status', 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check', 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
> [2018-07-11 18:42:49.365919] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
> [2018-07-11 18:42:49.369316] I [repce(/urd-gds/gluster):92:service_loop] RepceServer: terminating on reaching EOF.
> [2018-07-11 18:42:49.369921] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
> [2018-07-11 18:42:49.369694] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
> [2018-07-11 18:42:59.492762] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=ssh://geouser at urd-gds-geo-000:gluster://localhost:urd-gds-volume
> [2018-07-11 18:42:59.558491] I [resource(/urd-gds/gluster):1780:connect_remote] SSH: Initializing SSH connection between master and slave...
> [2018-07-11 18:42:59.559056] I [changelogagent(/urd-gds/gluster):73:__init__] ChangelogAgent: Agent listining...
> [2018-07-11 18:42:59.945693] E [syncdutils(/urd-gds/gluster):304:log_raise_exception] <top>: connection to peer is broken
> [2018-07-11 18:42:59.946439] E [resource(/urd-gds/gluster):210:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-992bk7/7e5534547f3675a710a107722317484f.sock geouser at urd-gds-geo-000 /nonexistent/gsyncd --session-owner 5e94eb7d-219f-4741-a179-d4ae6b50c7ee --local-id .%2Furd-gds%2Fgluster --local-node urd-gds-001 -N --listen --timeout 120 gluster://localhost:urd-gds-volume error=2
> [2018-07-11 18:42:59.946748] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> usage: gsyncd.py [-h]
> [2018-07-11 18:42:59.946962] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh>
> [2018-07-11 18:42:59.947150] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> {monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
> [2018-07-11 18:42:59.947369] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> ...
> [2018-07-11 18:42:59.947552] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice: '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status', 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check', 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
> [2018-07-11 18:42:59.948046] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
> [2018-07-11 18:42:59.951392] I [repce(/urd-gds/gluster):92:service_loop] RepceServer: terminating on reaching EOF.
> [2018-07-11 18:42:59.951760] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
> [2018-07-11 18:42:59.951817] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
> [2018-07-11 18:43:10.54580] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=ssh://geouser at urd-gds-geo-000:gluster://localhost:urd-gds-volume
> [2018-07-11 18:43:10.88356] I [monitor(monitor):345:monitor] Monitor: Changelog Agent died, Aborting Worker brick=/urd-gds/gluster
> [2018-07-11 18:43:10.88613] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
> [2018-07-11 18:43:20.112435] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change status=inconsistent
> [2018-07-11 18:43:20.112885] E [syncdutils(monitor):331:log_raise_exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 361, in twrap
>     except:
>   File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 428, in wmon
>     sys.exit()
> TypeError: 'int' object is not iterable
> [2018-07-11 18:43:20.114610] I [syncdutils(monitor):271:finalize] <top>: exiting.
--
Thanks and Regards,
Kotresh H R
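A minimal sketch of the library check suggested above, assuming an RPM-based CentOS install (an assumption here: on 64-bit systems the gluster libraries typically land under /usr/lib64 rather than /usr/lib, so both paths are checked):

# Locate the changelog library the packages installed
find /usr/lib /usr/lib64 -name 'libgfchangelog.so*'

# Refresh the runtime linker cache for those directories, then confirm the library resolves
ldconfig /usr/lib /usr/lib64
ldconfig -p | grep libgfchangelog

If only a versioned file such as libgfchangelog.so.0 shows up, the unversioned libgfchangelog.so name that gsyncd dlopen()s will still not resolve; that name is normally provided by the matching devel package or by a symlink pointing at the versioned library.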
Kotresh Hiremath Ravishankar
2018-Jul-18 04:05 UTC
[Gluster-users] Upgrade to 4.1.1 geo-replication does not work
Hi Marcus,

There is nothing wrong with setting up a symlink for the gluster binary location, but there is a geo-rep config command to set it, so that gsyncd will search there.

To set it on the master:

#gluster vol geo-rep <mastervol> <slave-vol> config gluster-command-dir <gluster-binary-location>

To set it on the slave:

#gluster vol geo-rep <mastervol> <slave-vol> config slave-gluster-command-dir <gluster-binary-location>
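For example, with the session from this thread (master volume urd-gds-volume, slave geouser@urd-gds-geo-001::urd-gds-volume) and assuming the binaries live in /usr/sbin as Marcus reported, the calls would look roughly like:

#gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume config gluster-command-dir /usr/sbin/
#gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume config slave-gluster-command-dir /usr/sbin/

A stop/start of the geo-replication session may be needed afterwards so the workers pick up the new command dir.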
Thanks,
Kotresh HR

--
Thanks and Regards,
Kotresh H R