Marcus Pedersén
2018-Jul-11 20:19 UTC
[Gluster-users] Upgrade to 4.1.1 geo-replication does not work
Hi all,

I have upgraded from 3.12.9 to 4.1.1 and have been following the upgrade
instructions for an offline upgrade.

I upgraded the geo-replication side first, 1 x (2+1), and the master side
after that, 2 x (2+1).

Both clusters work the way they should on their own.

After the upgrade, the status for all geo-replication nodes on the master
side is Stopped.

I tried to start geo-replication from the master node and the response was
"started successfully". Status again: Stopped. I tried to start again and
got the response "started successfully"; after that glusterd crashed on all
master nodes.

After a restart of all glusterd daemons the master cluster was up again.
The status for geo-replication is still Stopped, and every attempt to start
it after this gives the response "successful" but the status stays Stopped.

Please help me get geo-replication up and running again.

Best regards
Marcus Pedersén

Part of the geo-replication log from the master node:

[2018-07-11 18:42:48.941760] I [changelogagent(/urd-gds/gluster):73:__init__] ChangelogAgent: Agent listining...
[2018-07-11 18:42:48.947567] I [resource(/urd-gds/gluster):1780:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-07-11 18:42:49.363514] E [syncdutils(/urd-gds/gluster):304:log_raise_exception] <top>: connection to peer is broken
[2018-07-11 18:42:49.364279] E [resource(/urd-gds/gluster):210:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-hjRhBo/7e5534547f3675a710a107722317484f.sock geouser@urd-gds-geo-000 /nonexistent/gsyncd --session-owner 5e94eb7d-219f-4741-a179-d4ae6b50c7ee --local-id .%2Furd-gds%2Fgluster --local-node urd-gds-001 -N --listen --timeout 120 gluster://localhost:urd-gds-volume error=2
[2018-07-11 18:42:49.364586] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> usage: gsyncd.py [-h]
[2018-07-11 18:42:49.364799] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh>
[2018-07-11 18:42:49.364989] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> {monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
[2018-07-11 18:42:49.365210] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> ...
[2018-07-11 18:42:49.365408] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice: '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status', 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check', 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
[2018-07-11 18:42:49.365919] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
[2018-07-11 18:42:49.369316] I [repce(/urd-gds/gluster):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-07-11 18:42:49.369921] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
[2018-07-11 18:42:49.369694] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
[2018-07-11 18:42:59.492762] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=ssh://geouser@urd-gds-geo-000:gluster://localhost:urd-gds-volume
[2018-07-11 18:42:59.558491] I [resource(/urd-gds/gluster):1780:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-07-11 18:42:59.559056] I [changelogagent(/urd-gds/gluster):73:__init__] ChangelogAgent: Agent listining...
[2018-07-11 18:42:59.945693] E [syncdutils(/urd-gds/gluster):304:log_raise_exception] <top>: connection to peer is broken
[2018-07-11 18:42:59.946439] E [resource(/urd-gds/gluster):210:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-992bk7/7e5534547f3675a710a107722317484f.sock geouser@urd-gds-geo-000 /nonexistent/gsyncd --session-owner 5e94eb7d-219f-4741-a179-d4ae6b50c7ee --local-id .%2Furd-gds%2Fgluster --local-node urd-gds-001 -N --listen --timeout 120 gluster://localhost:urd-gds-volume error=2
[2018-07-11 18:42:59.946748] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> usage: gsyncd.py [-h]
[2018-07-11 18:42:59.946962] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh>
[2018-07-11 18:42:59.947150] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> {monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
[2018-07-11 18:42:59.947369] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> ...
[2018-07-11 18:42:59.947552] E [resource(/urd-gds/gluster):214:logerr] Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice: '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status', 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check', 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
[2018-07-11 18:42:59.948046] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
[2018-07-11 18:42:59.951392] I [repce(/urd-gds/gluster):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-07-11 18:42:59.951760] I [syncdutils(/urd-gds/gluster):271:finalize] <top>: exiting.
[2018-07-11 18:42:59.951817] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
[2018-07-11 18:43:10.54580] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=ssh://geouser@urd-gds-geo-000:gluster://localhost:urd-gds-volume
[2018-07-11 18:43:10.88356] I [monitor(monitor):345:monitor] Monitor: Changelog Agent died, Aborting Worker brick=/urd-gds/gluster
[2018-07-11 18:43:10.88613] I [monitor(monitor):353:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
[2018-07-11 18:43:20.112435] I [gsyncdstatus(monitor):242:set_worker_status] GeorepStatus: Worker Status Change status=inconsistent
[2018-07-11 18:43:20.112885] E [syncdutils(monitor):331:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 361, in twrap
    except:
  File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 428, in wmon
    sys.exit()
TypeError: 'int' object is not iterable
[2018-07-11 18:43:20.114610] I [syncdutils(monitor):271:finalize] <top>: exiting.

---
E-mailing SLU will result in SLU processing your personal data. For more
information on how this is done, click here
<https://www.slu.se/en/about-slu/contact-slu/personal-data/>
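The key line here is the ssh> error from the slave: the 4.1 gsyncd.py
expects a subcommand (monitor, worker, slave, ...) as its first positional
argument, so when the connection is still made with the old 3.x-style
option list, the session-owner UUID lands in the subcmd slot and argparse
rejects it. A minimal sketch that reproduces the same argparse behaviour
(the parser below is an illustrative stand-in, not the actual gsyncd.py
source):

    import argparse

    # Stand-in for the 4.1 gsyncd.py command-line parser (illustrative only).
    parser = argparse.ArgumentParser(prog="gsyncd.py")
    subparsers = parser.add_subparsers(dest="subcmd")
    for cmd in ("monitor-status", "monitor", "worker", "agent", "slave",
                "status", "config-check", "config-get", "config-set",
                "config-reset", "voluuidget", "delete"):
        subparsers.add_parser(cmd)

    # A 3.x-style invocation passes options instead of a subcommand; the
    # first bare argument (the session-owner UUID) is matched against the
    # subcmd choices and rejected, exactly as in the ssh> lines above:
    parser.parse_args(["--session-owner",
                       "5e94eb7d-219f-4741-a179-d4ae6b50c7ee",
                       "--local-id", ".%2Furd-gds%2Fgluster"])
    # gsyncd.py: error: argument subcmd: invalid choice:
    # '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status', ...)

In other words, after the upgrade the two ends are speaking different
gsyncd argument formats, which is why every worker dies before
establishing a connection.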
Kotresh Hiremath Ravishankar
2018-Jul-12 05:41 UTC
[Gluster-users] Upgrade to 4.1.1 geo-replication does not work
Hi Marcus,

I think the fix [1] is needed in 4.1. Could you please try this out and
let us know if that works for you?

[1] https://review.gluster.org/#/c/20207/

Thanks,
Kotresh HR

On Thu, Jul 12, 2018 at 1:49 AM, Marcus Pedersén <marcus.pedersen at slu.se> wrote:

> Hi all,
>
> I have upgraded from 3.12.9 to 4.1.1 and have been following the upgrade
> instructions for an offline upgrade.
>
> I upgraded the geo-replication side first, 1 x (2+1), and the master side
> after that, 2 x (2+1).
>
> Both clusters work the way they should on their own.
>
> After the upgrade, the status for all geo-replication nodes on the master
> side is Stopped.
>
> I tried to start geo-replication from the master node and the response
> was "started successfully". Status again: Stopped. I tried to start again
> and got the response "started successfully"; after that glusterd crashed
> on all master nodes.
>
> After a restart of all glusterd daemons the master cluster was up again.
> The status for geo-replication is still Stopped, and every attempt to
> start it after this gives the response "successful" but the status stays
> Stopped.
>
> Please help me get geo-replication up and running again.
>
> Best regards
>
> Marcus Pedersén
>
> [geo-replication log snipped; quoted in full in the original message above]
--
Thanks and Regards,
Kotresh H R
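A note on the traceback at the end of Marcus's log ("TypeError: 'int'
object is not iterable", raised from wmon in monitor.py): that message is
what Python 2, which the 4.1 syncdaemon typically runs under, produces
when a bare int is unpacked or iterated where a sequence is expected. A
hypothetical illustration of the pattern, with stand-in names rather than
the real monitor.py code:

    # Stand-in names; not the actual syncdaemon code.
    def start_worker():
        return 1  # an error path that hands back a bare status code...

    # ...while the caller expects a (worker_pid, agent_pid) pair:
    worker_pid, agent_pid = start_worker()
    # TypeError: 'int' object is not iterable
    # (Python 3.7+ words this "cannot unpack non-iterable int object")

That secondary crash kills the monitor process itself, which fits the
status flapping straight back to Stopped after every apparently
successful start; whether the fix in [1] covers this exact spot is best
confirmed on the review itself.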