Hi,

I would like some help troubleshooting a problem I have been having lately with my VM host, which runs 5 VMs, one of which provides the pi-hole and unbound services. Over the last few weeks I have often come home from work to find that the host machine has lost its network. Restoring the VM(s) does not fix the problem; the host has to be rebooted. Until then there is neither name resolution nor an internet connection; I cannot even ping IPs such as 8.8.8.8. Since I use the pi-hole VM as the DNS server for my LAN, my whole LAN is cut off from the internet until the host machine is rebooted.

The host machine has a somewhat complicated network setup: the two gigabit connections are bonded and bridged to the VMs. However, this setup has served me well for several years, whereas the problem appeared only a few weeks ago. It doesn't happen every day, but often enough to be annoying and disruptive for my family.

My question is: how can I troubleshoot this problem and figure out whether it is truly due to the network bridging somehow collapsing?
I tried to find some log files, but all I could find were the /var/log/libvirt/qemu/$VM files. The log file for the pi-hole VM contained the following lines; however, I am not sure whether they reflect a real crash or just the shutdown and restart of the host (please excuse the word-wrapping):

char device redirected to /dev/pts/2 (label charserial0)
qxl_send_events: spice-server bug: guest stopped, ignoring
2022-01-20T23:41:17.012445Z qemu-system-x86_64: terminating on signal 15 from pid 1 (/sbin/init)
2022-01-20 23:41:17.716+0000: shutting down, reason=crashed
2022-01-20 23:42:46.059+0000: starting up libvirt version: 7.10.0, qemu version: 6.2.0, kernel: 5.10.89-1-MANJARO, hostname: -redacted-

Please excuse my ignorance, but is there a way to restart the networking without rebooting the host machine? This will not solve my problem, since I won't be able to reach the host remotely if the networking is down. The real solution would be preventing these network crashes, and in my opinion the first step is effective troubleshooting. Any input/guidance will be greatly appreciated.

I can provide more info about my host/VM(s) if the above is not adequate.

Thanks,

Hakan Duran
Martin Kletzander
2022-Jan-24 09:35 UTC
frequent network collapse possibly due to bridging
On Fri, Jan 21, 2022 at 08:42:58AM -0600, Hakan E. Duran wrote:
>Hi,
>
>I would like some help to troubleshoot the problem I have been having
>lately with my VM host, which contains 5 VMs, one of which is for
>pi-hole, unbound services. It has been a relatively common occurrence in
>the last few weeks for me to find that the host machine has lost its
>network when I get back home from work. Restoring the VM/VMs do not fix
>the problem, the host needs to be restarted for a fix, otherwise there
>is both loss of name resolution, as well as an internet connection; I
>cannot ping even IPs such as 8.8.8.8. Since I use the pi-hole VM as the DNS
>server for my LAN, this means that my whole LAN gets disconnected from
>internet, until the host machine is rebooted. The host machine has a
>little complicated network setup: the two gigabit connections are bonded
>and bridged to the VMs; however this set up has been serving me so well
>for several years now. The problem, on the other hand, appeared a few
>weeks ago. This doesn't happen every day but often enough to be annoying
>and disruptive for my family.
>

It is always good to check what changed a few weeks ago, though I
understand it can be difficult to work out what was updated and where.

>My question is, how can I troubleshoot this problem and figure out
>whether it is truly due to network bridging somehow collapsing or not? I
>tried to find some log files but all I could find were the
>/var/log/libvirt/qemu/$VM files, and the particular log file for the pi-hole
>VM reported the following lines; however, I am not sure if they are
>associated with a real crash or just due to shutting down and restarting
>the host (please excuse the word-wrapping):
>
>char device redirected to /dev/pts/2 (label charserial0)
>qxl_send_events: spice-server bug: guest stopped, ignoring
>2022-01-20T23:41:17.012445Z qemu-system-x86_64: terminating on signal 15 from pid 1 (/sbin/init)

That is probably the host restarting, since the process got a SIGTERM
from init. Maybe it was restarted at a bad time and there is some
inconsistency on the disk? Using something like libvirt-guests, which
can manage your machines across host reboots, would be a good idea.

>2022-01-20 23:41:17.716+0000: shutting down, reason=crashed
>2022-01-20 23:42:46.059+0000: starting up libvirt version: 7.10.0, qemu
>version: 6.2.0, kernel: 5.10.89-1-MANJARO, hostname: -redacted-
>
>Please excuse my ignorance but is there a way to restart the
>networking without rebooting the host machine? This will not solve my

You can do:

  virsh net-destroy <network_name>
  virsh net-start <network_name>

but depending on what the network looks like, how it is set up, etc.,
you might need to restart some of the VMs or plug them in manually.

>problem since I won't be able to reach to the host remotely if the
>networking is down. The real solution would be preventing these network
>crashes and the first step in that would be effective troubleshooting in
>my opinion. Any input/guidance will be greatly appreciated.
>
>I can provide more info about my host/VM(s) if the above is not adequate.
>

I'm not sure how much more I can help, as I do not understand the
actual setup. What I would do is try to figure out what exactly happens
when it breaks and go from there (setting up logging etc.). Just
general tips, I guess.

>Thanks,
>
>Hakan Duran
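
As a rough illustration of the "setting up logging" suggestion, a small snapshot function along these lines could record the state of the bond, the bridge and the routing table, so that there is something to look at after the next outage. This is only a sketch under assumptions: the interface names bond0 and br0 are placeholders (the thread does not give the real names), the helper names run and net_snapshot are made up for this example, and a systemd host with iproute2 installed is assumed.

```shell
#!/bin/sh
# Placeholder interface names (assumptions); override via the environment.
BOND_IF="${BOND_IF:-bond0}"
BRIDGE_IF="${BRIDGE_IF:-br0}"

# Print the command as a header, then its output. A missing or failing
# command is noted in the log instead of aborting the snapshot.
run() {
    printf '%s %s\n' '---' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(exit status %s)\n' "$?"
    else
        printf '(%s not installed)\n' "$1"
    fi
}

# Collect one timestamped snapshot of the host's network state on stdout.
net_snapshot() {
    printf '=== snapshot %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    run ip -br link show                       # link state of all interfaces
    run ip -br addr show dev "$BRIDGE_IF"      # addresses on the bridge
    run ip route show                          # is the default route still there?
    run cat "/proc/net/bonding/$BOND_IF"       # bonding driver status per slave
    run bridge link show                       # ports attached to bridges
    run ping -c 1 -W 2 8.8.8.8                 # reachability without DNS
    run journalctl -k --since="10min ago" --no-pager   # recent kernel messages
}
```

Running something like `net_snapshot >> /var/log/net-snapshot.log` (a hypothetical path) every few minutes from cron or a systemd timer keeps the output on local disk, so it survives until the host can be reached again. Comparing the last healthy snapshot with the first broken one should show whether a bond slave, the bridge or the default route changed. Also note that `virsh net-list --all` shows which networks libvirt itself manages; a bridge configured outside libvirt may not be listed there, in which case the virsh net-destroy/net-start pair would not apply to it.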