Jeremey Wise
2021-Jan-07 14:48 UTC
[CentOS-virt] HCI Cluster - CentOS8 to Streams Upgrade Broken
I have a test environment: a three-node HCI cluster built on CentOS 8, with Gluster as the file system and the standard Cockpit deployment of HCI.

I converted it to CentOS Stream, which seemed to go fine. I ran a yum update with no issues. Then I rebooted, and now the engine will no longer start, so I can no longer start my virtual machines.

I filed a bug at https://bugzilla.redhat.com/show_bug.cgi?id=1911910 and posted to the CentOS forum at https://forums.centos.org/viewtopic.php?f=54&t=76716, but have had no responses.

Can anyone suggest a next step to find the root cause or a fix? It would take me days to rebuild the entire setup, and I really hate to "reload" as a fix, but after two weeks with nothing changing I just have to find some way to get the cluster working again.

Thanks.

##### all three servers are looping the events below in /var/log/messages ###
Jan 7 09:48:03 thor journal[2375050]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.broker.Broker ERROR Failed initializing the broker: [Errno 107] Transport endpoint is not connected: '/rhev/data-center/mnt/glusterSD/thorst.penguinpages.local:_engine/3afc47ba-afb9-413f-8de5-8d9a2f45ecde/ha_agent/hosted-engine.metadata'
Jan 7 09:48:03 thor journal[2375050]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.broker.Broker ERROR Traceback (most recent call last):#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 64, in run#012 self._storage_broker_instance self._get_storage_broker()#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py", line 143, in _get_storage_broker#012 return storage_broker.StorageBroker()#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 97, in __init__#012 self._backend.connect()#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 408, in connect#012 self._check_symlinks(self._storage_path, volume.path, service_link)#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 105, in _check_symlinks#012 os.unlink(service_link)#012OSError: [Errno 107] Transport endpoint is not connected: '/rhev/data-center/mnt/glusterSD/thorst.penguinpages.local:_engine/3afc47ba-afb9-413f-8de5-8d9a2f45ecde/ha_agent/hosted-engine.metadata'
Jan 7 09:48:03 thor journal[2375050]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.broker.Broker ERROR Trying to restart the broker
Jan 7 09:48:03 thor platform-python[2375050]: detected unhandled Python exception in '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
Jan 7 09:48:03 thor abrt-server[2375084]: Not saving repeating crash in '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
Jan 7 09:48:03 thor systemd[1]: ovirt-ha-broker.service: Main process exited, code=exited, status=1/FAILURE
Jan 7 09:48:03 thor systemd[1]: ovirt-ha-broker.service: Failed with result 'exit-code'.
Jan 7 09:48:04 thor systemd[1]: ovirt-ha-broker.service: Service RestartSec=100ms expired, scheduling restart.
Jan 7 09:48:04 thor systemd[1]: ovirt-ha-broker.service: Scheduled restart job, restart counter is at 44569.
Jan 7 09:48:04 thor systemd[1]: Stopped oVirt Hosted Engine High Availability Communications Broker.
Jan 7 09:48:04 thor systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Jan 7 09:48:06 thor systemd[1]: ovirt-ha-agent.service: Service RestartSec=10s expired, scheduling restart.
Jan 7 09:48:06 thor systemd[1]: ovirt-ha-agent.service: Scheduled restart job, restart counter is at 22270.
Jan 7 09:48:06 thor systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
Jan 7 09:48:06 thor systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Jan 7 09:48:06 thor systemd[1]: Started Session c44598 of user root.
Jan 7 09:48:06 thor systemd[1]: session-c44598.scope: Succeeded.
Jan 7 09:48:06 thor journal[2375091]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to start necessary monitors
Jan 7 09:48:06 thor journal[2375091]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 85, in start_monitor#012 response = self._proxy.start_monitor(type, options)#012 File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__#012 return self.__send(self.__name, args)#012 File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request#012 verbose=self.__verbose#012 File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request#012 return self.single_request(host, handler, request_body, verbose)#012 File "/usr/lib64/python3.6/xmlrpc/client.py", line 1166, in single_request#012 http_conn = self.send_request(host, handler, request_body, verbose)#012 File "/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request#012 self.send_content(connection, request_body)#012 File "/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content#012 connection.endheaders(request_body)#012 File "/usr/lib64/python3.6/http/client.py", line 1264, in endheaders#012 self._send_output(message_body, encode_chunked=encode_chunked)#012 File "/usr/lib64/python3.6/http/client.py", line 1040, in _send_output#012 self.send(msg)#012 File "/usr/lib64/python3.6/http/client.py", line 978, in send#012 self.connect()#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 74, in connect#012 self.sock.connect(base64.b16decode(self.host))#012FileNotFoundError: [Errno 2] No such file or directory#012#012During handling of the above exception, another exception occurred:#012#012Traceback (most recent call last):#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent#012 return action(he)#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper#012 return he.start_monitoring()#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring#012 self._initialize_broker()#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 561, in _initialize_broker#012 m.get('options', {}))#012 File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 91, in start_monitor#012 ).format(t=type, o=options, e=e)#012ovirt_hosted_engine_ha.lib.exceptions.RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'network', options: {'addr': '172.16.100.1', 'network_test': 'dns', 'tcp_t_address': '', 'tcp_t_port': ''}]
Jan 7 09:48:06 thor journal[2375091]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
Jan 7 09:48:06 thor systemd[1]: ovirt-ha-agent.service: Main process exited, code=exited, status=157/n/a
Jan 7 09:48:06 thor systemd[1]: ovirt-ha-agent.service: Failed with result 'exit-code'.
####

Is there a way to start a VM when oVirt / the engine is offline?

--
penguinpages
Sandro Bonazzola
2021-Jan-18 09:39 UTC
[CentOS-virt] HCI Cluster - CentOS8 to Streams Upgrade Broken
On Thu, 7 Jan 2021 at 15:58, Jeremey Wise <jeremey.wise at gmail.com> wrote:

> I have a test environment. Three node HCI cluster. CentOS8 build.
> Gluster as file system with standard cockpit deploy of HCI.

Hi,
I would recommend reaching out to the users at ovirt.org mailing list for oVirt-related issues.

> Converted to CentOS Streams which seemed to go fine. Did a yum update and
> no issues.
>
> Did a reboot.. and now engine will no longer start. So I can no longer
> start my Virtual machines. I posted as bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1911910 I posted to CentOS
> forum https://forums.centos.org/viewtopic.php?f=54&t=76716 but no
> responses.
>
> Can anyone provide means or next step to root cause and or fix?

We pushed a fix, which is in the current nightly build, but please keep using CentOS Linux for oVirt until its full compatibility with CentOS Stream is officially announced. Currently, running oVirt on CentOS Stream is a tech preview, up to oVirt 4.4.4.

> It would take me days to rebuild the entire solution and I really hate to
> "reload" as a fix.. but after two weeks... nothing changing... just have to
> find some means to get cluster back working.
>
> Thanks.
>
> ##### all three servers are looping below events in /var/log/messages ###
> [...]

--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo at redhat.com
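A note for anyone reading the log entries in this thread: the `#012` sequences are most likely the syslog daemon's escape for an embedded newline ('#' followed by a three-digit octal code), which is why each Python traceback arrives as one long line. The sketch below is only a reading aid, assuming that escape format; the function names are made up for illustration and the sample line is abbreviated from the broker traceback above. It expands the escapes and maps any `[Errno N]` marker to its symbolic name on the machine where it runs.

```python
import errno
import re

# '#' followed by three octal digits, e.g. '#012' for a newline (assumed format).
_OCTAL_ESCAPE = re.compile(r"#([0-7]{3})")
# The '[Errno N]' marker that Python puts in OSError messages.
_ERRNO_MARKER = re.compile(r"\[Errno (\d+)\]")


def expand_syslog_escapes(line: str) -> str:
    """Replace each '#ooo' octal escape with the character it encodes."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), line)


def errno_names(line: str) -> list:
    """Return the symbolic names of all errno codes mentioned in the line."""
    codes = {int(n) for n in _ERRNO_MARKER.findall(line)}
    return [errno.errorcode.get(code, "unknown") for code in sorted(codes)]


if __name__ == "__main__":
    # Abbreviated from the ovirt-ha-broker entry; the file path is shortened.
    sample = (
        'Traceback (most recent call last):#012 File "storage_backends.py", '
        "line 105, in _check_symlinks#012 os.unlink(service_link)"
        "#012OSError: [Errno 107] Transport endpoint is not connected"
    )
    print(expand_syslog_escapes(sample))
    print(errno_names(sample))
```

One caveat: any literal '#' followed by three octal digits in a message would also be rewritten, so this is best applied to the traceback lines only. On Linux, errno 107 is ENOTCONN, which on a FUSE mount such as a Gluster volume usually means the client process behind the mount is no longer connected; that reading is an inference from the error text, not something confirmed in this thread.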