Marcus Pedersén
2018-Jul-12 06:51 UTC
[Gluster-users] Upgrade to 4.1.1 geo-replication does not work
Thanks Kotresh,
I installed through the official CentOS channel, centos-release-gluster41.
Isn't this fix included in the CentOS install?
I will have a look, test it tonight, and get back to you!
Thanks a lot!
Regards
Marcus
################
Marcus Pedersén
System administrator
Interbull Centre
################
Sent from my phone
################
On 12 July 2018 at 07:41, Kotresh Hiremath Ravishankar <khiremat at
redhat.com> wrote:
Hi Marcus,
I think the fix [1] is needed in 4.1.
Could you please try this out and let us know if it works for you?
[1] https://review.gluster.org/#/c/20207/
Thanks,
Kotresh HR
On Thu, Jul 12, 2018 at 1:49 AM, Marcus Pedersén <marcus.pedersen at
slu.se> wrote:
Hi all,
I have upgraded from 3.12.9 to 4.1.1, following the upgrade instructions for
an offline upgrade.
I upgraded the geo-replication side first, 1 x (2+1), and the master side after
that, 2 x (2+1).
Both clusters work the way they should on their own.
After the upgrade on the master side, the status for all geo-replication nodes
is Stopped.
I tried to start geo-replication from a master node and the response was
"started successfully".
Status again: Stopped.
I tried to start it again and got the response "started successfully"; after
that, glusterd crashed on all master nodes.
After a restart of glusterd on all nodes, the master cluster was up again.
The status for geo-replication is still Stopped, and every attempt to start it
after this gives the response "successful" while the status remains Stopped.
Please help me get geo-replication up and running again.
Best regards
Marcus Pedersén
Part of geo-replication log from master node:
[2018-07-11 18:42:48.941760] I [changelogagent(/urd-gds/gluster):73:__init__] ChangelogAgent: Agent listining...
[2018-07-11 18:42:48.947567] I [resource(/urd-gds/gluster):1780:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-07-11 18:42:49.363514] E [syncdutils(/urd-gds/gluster):304:log_raise_exception] <top>: connection to peer is broken
[2018-07-11 18:42:49.364279] E [resource(/urd-gds/gluster):210:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-hjRhBo/7e5534547f3675a710a107722317484f.sock geouser@urd-gds-geo-000 /nonexistent/gsyncd --session-owner 5e94eb7d-219f-4741-a179-d4ae6b50c7ee --local-id .%2Furd-gds%2Fgluster --local-node urd-gds-001 -N --listen --timeout 120 gluster://localhost:urd-gds-volume error=2
[2018-07-11 18:42:49.364586] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> usage: gsyncd.py [-h]
[2018-07-11 18:42:49.364799] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh>
[2018-07-11 18:42:49.364989] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> {monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
[2018-07-11 18:42:49.365210] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> ...
[2018-07-11 18:42:49.365408] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice: '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status', 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check', 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
[2018-07-11 18:42:49.365919] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
[2018-07-11 18:42:49.369316] I [repce(/urd-gds/gluster):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-07-11 18:42:49.369921] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
[2018-07-11 18:42:49.369694] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
[2018-07-11 18:42:59.492762] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=ssh://geouser@urd-gds-geo-000:gluster://localhost:urd-gds-volume
[2018-07-11 18:42:59.558491] I [resource(/urd-gds/gluster):1780:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-07-11 18:42:59.559056] I [changelogagent(/urd-gds/gluster):73:__init__] ChangelogAgent: Agent listining...
[2018-07-11 18:42:59.945693] E [syncdutils(/urd-gds/gluster):304:log_raise_exception] <top>: connection to peer is broken
[2018-07-11 18:42:59.946439] E [resource(/urd-gds/gluster):210:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-992bk7/7e5534547f3675a710a107722317484f.sock geouser@urd-gds-geo-000 /nonexistent/gsyncd --session-owner 5e94eb7d-219f-4741-a179-d4ae6b50c7ee --local-id .%2Furd-gds%2Fgluster --local-node urd-gds-001 -N --listen --timeout 120 gluster://localhost:urd-gds-volume error=2
[2018-07-11 18:42:59.946748] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> usage: gsyncd.py [-h]
[2018-07-11 18:42:59.946962] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh>
[2018-07-11 18:42:59.947150] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> {monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
[2018-07-11 18:42:59.947369] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> ...
[2018-07-11 18:42:59.947552] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice: '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status', 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check', 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
[2018-07-11 18:42:59.948046] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
[2018-07-11 18:42:59.951392] I [repce(/urd-gds/gluster):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-07-11 18:42:59.951760] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
[2018-07-11 18:42:59.951817] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
[2018-07-11 18:43:10.54580] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=ssh://geouser@urd-gds-geo-000:gluster://localhost:urd-gds-volume
[2018-07-11 18:43:10.88356] I [monitor(monitor):345:monitor] Monitor: Changelog Agent died, Aborting Worker brick=/urd-gds/gluster
[2018-07-11 18:43:10.88613] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
[2018-07-11 18:43:20.112435] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change status=inconsistent
[2018-07-11 18:43:20.112885] E [syncdutils(monitor):331:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 361, in twrap
    except:
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 428, in wmon
    sys.exit()
TypeError: 'int' object is not iterable
[2018-07-11 18:43:20.114610] I [syncdutils(monitor):271:finalize] <top>: exiting.
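The "invalid choice" error in the log above is consistent with how Python's argparse treats an old-style option list when the parser expects a subcommand as the first positional argument. The sketch below is only an illustration of that behavior, not the actual gsyncd.py code: the parser layout and the shortened subcommand list are assumptions, and only the option names and the session-owner UUID are taken from the log.

```python
import argparse
import contextlib
import io


def build_parser():
    # Hypothetical stand-in for a subcommand-based CLI; not the real gsyncd.py parser.
    parser = argparse.ArgumentParser(prog="gsyncd.py")
    sub = parser.add_subparsers(dest="subcmd")
    for name in ("monitor", "worker", "agent", "slave", "status"):
        sub.add_parser(name)
    return parser


def try_parse(argv):
    """Parse argv and return (exit_code, stderr_text) instead of exiting."""
    stderr = io.StringIO()
    try:
        with contextlib.redirect_stderr(stderr):
            build_parser().parse_args(argv)
    except SystemExit as exc:
        return exc.code, stderr.getvalue()
    return 0, stderr.getvalue()


# Old-style invocation as seen in the log: options first, no subcommand.
old_style = [
    "--session-owner", "5e94eb7d-219f-4741-a179-d4ae6b50c7ee",
    "--local-node", "urd-gds-001",
]
code, message = try_parse(old_style)
print(code)
print(message)
```

The unknown option --session-owner is set aside, so its value becomes the first positional argument and is rejected as a subcommand. argparse exits with status 2 on such errors, which matches the error=2 reported by Popen in the log.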
---
När du skickar e-post till SLU så innebär detta att SLU behandlar dina
personuppgifter. För att läsa mer om hur detta går till, klicka här
<https://www.slu.se/om-slu/kontakta-slu/personuppgifter/>
E-mailing SLU will result in SLU processing your personal data. For more
information on how this is done, click here
<https://www.slu.se/en/about-slu/contact-slu/personal-data/>
_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
--
Thanks and Regards,
Kotresh H R
Marcus Pedersén
2018-Jul-12 19:56 UTC
[Gluster-users] Upgrade to 4.1.1 geo-replication does not work
Hi Kotresh,
I have replaced both files
(gsyncdconfig.py <https://review.gluster.org/#/c/20207/1/geo-replication/syncdaemon/gsyncdconfig.py>
and
repce.py <https://review.gluster.org/#/c/20207/1/geo-replication/syncdaemon/repce.py>)
on all nodes, both master and slave.
I rebooted all servers, but the geo-replication status is still Stopped.
I tried to start geo-replication and got the response Successful, but the
status still shows Stopped on all nodes.
Nothing has been written to the geo-replication logs since I sent the tail of
the log,
so I do not know what information to provide.
Please help me find a way to solve this.
Thanks!
Regards
Marcus
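One way to double-check that no new errors were logged after a given point is to filter the geo-replication log by level and timestamp. The following is a rough sketch under assumptions: the line pattern is inferred from the log excerpt earlier in this thread, the function name entries_after is made up for illustration, and the fractional seconds are ignored because the excerpt shows them with varying width.

```python
import re
from datetime import datetime

# Pattern inferred from the excerpt: [timestamp] LEVEL [module(brick):line:func] message
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\] ([A-Z]) \[([^\]]+)\] (.*)$"
)


def entries_after(lines, cutoff, levels=("E",)):
    """Return (time, level, origin, message) for matching entries newer than cutoff."""
    found = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # continuation lines such as tracebacks are skipped
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if stamp > cutoff and match.group(2) in levels:
            found.append((stamp, match.group(2), match.group(3), match.group(4)))
    return found


# Sample lines copied from the log excerpt in this thread.
sample = [
    "[2018-07-11 18:43:10.88613] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster",
    "[2018-07-11 18:43:20.112885] E [syncdutils(monitor):331:log_raise_exception] <top>: FAIL:",
    "[2018-07-11 18:43:20.114610] I [syncdutils(monitor):271:finalize] <top>: exiting.",
]
errors = entries_after(sample, datetime(2018, 7, 11, 18, 43, 0))
for stamp, level, origin, text in errors:
    print(stamp, level, origin, text)
```

To run it against a real file, pass the lines of the log to entries_after with the time of the last restart as the cutoff; an empty result would support the observation that nothing new has been logged.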