Andrea Bolognani
2016-Sep-06 12:56 UTC
Re: [libvirt-users] systemctl libvirt-guests.service fails to start during host boot
On Thu, 2016-08-11 at 20:41 +0200, Benoit wrote:> Thanks Andrea, > > Yes I confirm to you that I have enabled libvirt as well. > > I don't see any error right now but I have to investigate a little bit more. > > My guests are in autostart in virsh so everything is fine on this, the > only issue I got is in case of shutdown. > > The strange thing is that sometimes libvirt-guests is up and running > sometimes failed...I'm trying to track this down, but 1) I can't reproduce it on my setup and 2) it doesn't make any sense :) It's almost as if libvirtd.service would be started during boot, but not every single time. Except you always have it running after boot, right? And After=libvirtd.service tells systemd *exactly* to wait until libvirtd has started before starting libvirt-guests, so the situation you apparently find yourself in should never, ever happen, barring a systemd bug. Since you're running an Arch variant, I'm going to assume all your software is basically at the latest version all around, isn't it?> just for info this is my libvirt-guests.service (by default) > > [Unit] > Description=Suspend Active Libvirt Guests > After=network.target libvirtd.service time-sync.target > Documentation=man:libvirtd(8) > Documentation=http://libvirt.org > > [Service] > EnvironmentFile=-/etc/conf.d/libvirt-guests > # Hack just call traditional service until we factor > # out the code > ExecStart=/usr/lib/libvirt/libvirt-guests.sh start > ExecStop=/usr/lib/libvirt/libvirt-guests.sh stop > Type=oneshot > RemainAfterExit=yes > StandardOutput=journal+console > TimeoutStopSec=0 > > [Install] > WantedBy=multi-user.targetCan you try adding [Unit] Requires=libvirtd.service to your libvirt-guests.service via 'systemctl edit' and see if it helps? I don't think it should be needed for your use case, but it's probably a good idea to have it there anyways. -- Andrea Bolognani / Red Hat / Virtualization
Benoit
2016-Sep-15 16:10 UTC
Re: [libvirt-users] systemctl libvirt-guests.service fails to start during host boot
Hi Andrea,> Since you're running an Arch variant, I'm going to assume > all your software is basically at the latest version all > around, isn't it?Yep exactly, all software / kernel are in latest version> Can you try adding > > [Unit] > Requires=libvirtd.service > > to your libvirt-guests.service via 'systemctl edit' and > see if it helps? I don't think it should be needed for your > use case, but it's probably a good idea to have it there > anyways.I have tried and as you expected it didn't changed anything, same issue. I made more test and it is very reproductible, let me give you my steps to reproduce it : -I shutdown -r now my host. -when it restarted, first thing to check is : systemctl status libvirt-guests.service, I got : libvirt-guests.service - Suspend Active Libvirt Guests Loaded: loaded (/usr/lib/systemd/system/libvirt-guests.service; enabled; vendor preset: disabled) Active: failed (Result: exit-code) since Thu 2016-09-15 17:40:07 CEST; 1min 47s ago Docs: man:libvirtd(8) http://libvirt.org Main PID: 329 (code=exited, status=1/FAILURE) Sep 15 17:40:07 libvirt-guests.sh[329]: Resuming guest : error: failed to connect to the hypervisor Sep 15 17:40:07 libvirt-guests.sh[329]: error: no valid connection Sep 15 17:40:07 libvirt-guests.sh[329]: error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory Sep 15 17:40:07 libvirt-guests.sh[329]: Resuming guest : error: failed to connect to the hypervisor Sep 15 17:40:07 libvirt-guests.sh[329]: error: no valid connection Sep 15 17:40:07 libvirt-guests.sh[329]: error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Main process exited, code=exited, status=1/FAILURE Sep 15 17:40:07 systemd[1]: Failed to start Suspend Active Libvirt Guests. Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Unit entered failed state. Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Failed with result 'exit-code'. Even if the service is in FAILED mode, my guests are started correctly. If I do a systemctl restart libvirt-guests.service, everything is good : libvirt-guests.service - Suspend Active Libvirt Guests Loaded: loaded (/usr/lib/systemd/system/libvirt-guests.service; enabled; vendor preset: disabled) Active: active (exited) since Thu 2016-09-15 17:42:01 CEST; 9min ago Docs: man:libvirtd(8) http://libvirt.org Main PID: 605 (code=exited, status=0/SUCCESS) Sep 15 17:42:01 systemd[1]: Starting Suspend Active Libvirt Guests... Sep 15 17:42:01 systemd[1]: Started Suspend Active Libvirt Guests. Now if I do a shutdown -r now again, I can see both in my libvirt logs + in my guests log that an ACPI commands has been sent from libvirt and my guests shutdown gently, so everything is ok. When I start again, and try to do a systemctl status libvirt-guests.service, I got the same FAILED status as before.... I made a research on my logs : Sep 15 17:39:29 libvirt-guests.sh[1164]: Shutting down guests on default URI... Sep 15 17:39:29 libvirt-guests.sh[1164]: Starting shutdown on guest: ape Sep 15 17:39:30 libvirt-guests.sh[1164]: Waiting for guest ape to shut down, 30 seconds left Sep 15 17:39:35 libvirt-guests.sh[1164]: Waiting for guest ape to shut down, 25 seconds left Sep 15 17:39:36 libvirtd[326]: internal error: End of file from monitor Sep 15 17:39:36 libvirtd[326]: Cannot write data: Broken pipe Sep 15 17:39:36 libvirt-guests.sh[1164]: Shutdown of guest ape complete. Sep 15 17:39:36 libvirt-guests.sh[1164]: Starting shutdown on guest: om Sep 15 17:39:37 libvirt-guests.sh[1164]: Waiting for guest om to shut down, 30 seconds left Sep 15 17:39:40 libvirtd[326]: internal error: End of file from monitor Sep 15 17:39:40 libvirtd[326]: Cannot write data: Broken pipe Sep 15 17:39:40 libvirt-guests.sh[1164]: Shutdown of guest om complete. Sep 15 17:40:07 libvirt-guests.sh[329]: Resuming guests on default URI... Sep 15 17:40:07 libvirt-guests.sh[329]: Resuming guest : error: failed to connect to the hypervisor Sep 15 17:40:07 libvirt-guests.sh[329]: error: no valid connection Sep 15 17:40:07 libvirt-guests.sh[329]: error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory Sep 15 17:40:07 libvirt-guests.sh[329]: Resuming guest : error: failed to connect to the hypervisor Sep 15 17:40:07 libvirt-guests.sh[329]: error: no valid connection Sep 15 17:40:07 libvirt-guests.sh[329]: error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Main process exited, code=exited, status=1/FAILURE Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Unit entered failed state. Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Failed with result 'exit-code'. Sep 15 17:40:07 libvirtd[325]: libvirt version: 2.1.0 Sep 15 17:40:07 libvirtd[325]: hostname: oo.localdomain Sep 15 17:40:07 libvirtd[325]: Cannot check dnsmasq binary /sbin/dnsmasq: No such file or directory Sep 15 17:40:07 libvirtd[325]: direct firewall backend requested, but /sbin/ebtables is not available: No such file or directory Sep 15 17:40:07 libvirtd[325]: internal error: Failed to initialize a valid firewall backend Sep 15 17:40:07 libvirtd[325]: internal error: Failed to find path for dmidecode binary Sep 15 17:40:07 libvirtd[325]: internal error: Failed to find path for dmidecode binary Sep 15 17:40:08 virtlogd[421]: libvirt version: 2.1.0 We see the same error as systemctl status plus some error like " internal error: End of file from monitor" and others type. Don't know if I have anything to take a look there ? Finally, I also notice some very strange thing: If I stop the service with systemctl stop libvirt-guests.service it tooks a while to execute the stop command and fails. Sep 15 18:00:44 libvirt-guests.sh[978]: Starting shutdown on guest: ape Sep 15 18:00:45 libvirt-guests.sh[978]: Waiting for guest ape to shut down, 30 seconds left Sep 15 18:00:50 libvirt-guests.sh[978]: Waiting for guest ape to shut down, 25 seconds left Sep 15 18:00:55 libvirt-guests.sh[978]: Waiting for guest ape to shut down, 20 seconds left Sep 15 18:01:00 libvirt-guests.sh[978]: Waiting for guest ape to shut down, 15 seconds left Sep 15 18:01:06 libvirt-guests.sh[978]: Waiting for guest ape to shut down, 10 seconds left Sep 15 18:01:11 libvirt-guests.sh[978]: Waiting for guest ape to shut down, 5 seconds left Sep 15 18:01:15 libvirt-guests.sh[978]: Shutdown of guest ape failed to complete in time. So my guests are not able to shutdown properly but when I shutdown -r now my host I can see that my guests have been shutdown gently, I guess there is something related on my issue.... Many thanks for your help again
Andrea Bolognani
2016-Sep-26 15:43 UTC
Re: [libvirt-users] systemctl libvirt-guests.service fails to start during host boot
On Thu, 2016-09-15 at 18:10 +0200, Benoit wrote:> Hi Andrea,Hi, sorry for taking so long to reply. I'm still unable to reproduce your failure, which of course makes it very difficult to debug :( But I might have a couple leads, so let's try and move forward with that.> > Since you're running an Arch variant, I'm going to assume > > all your software is basically at the latest version all > > around, isn't it? > > Yep exactly, all software / kernel are in latest version>From the log below it looks like you're running 2.1.0, whichmight be the latest version your distribution provides, but the same is not true when it comes to upstream development.> > Can you try adding > > > > [Unit] > > Requires=libvirtd.service > > > > to your libvirt-guests.service via 'systemctl edit' and > > see if it helps? I don't think it should be needed for your > > use case, but it's probably a good idea to have it there > > anyways. > > I have tried and as you expected it didn't changed anything, same issue.Can you please provide the output of # systemctl cat libvirt-guests.service and # systemctl show libvirt-guests.service as well? Those could contain helpful information.> I made more test and it is very reproductible, let me give you my steps > to reproduce it : > > -I shutdown -r now my host. > > -when it restarted, first thing to check is : systemctl status > libvirt-guests.service, I got : > > libvirt-guests.service - Suspend Active Libvirt Guests > Loaded: loaded (/usr/lib/systemd/system/libvirt-guests.service; > enabled; vendor preset: disabled)>From the output of systemctl it looks like you didn't addthe Requires=libvirtd.service relationship. Maybe you just removed it after testing?> Active: failed (Result: exit-code) since Thu 2016-09-15 17:40:07 > CEST; 1min 47s ago > Docs: man:libvirtd(8) > http://libvirt.org > Main PID: 329 (code=exited, status=1/FAILURE) > > Sep 15 17:40:07 libvirt-guests.sh[329]: Resuming guest : error: failed > to connect to the hypervisor > Sep 15 17:40:07 libvirt-guests.sh[329]: error: no valid connection > Sep 15 17:40:07 libvirt-guests.sh[329]: error: Failed to connect socket > to '/var/run/libvirt/libvirt-sock': No such file or directory > Sep 15 17:40:07 libvirt-guests.sh[329]: Resuming guest : error: failed > to connect to the hypervisor > Sep 15 17:40:07 libvirt-guests.sh[329]: error: no valid connection > Sep 15 17:40:07 libvirt-guests.sh[329]: error: Failed to connect socket > to '/var/run/libvirt/libvirt-sock': No such file or directory > Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Main process exited, > code=exited, status=1/FAILURE > Sep 15 17:40:07 systemd[1]: Failed to start Suspend Active Libvirt Guests. > Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Unit entered failed > state. > Sep 15 17:40:07 systemd[1]: libvirt-guests.service: Failed with result > 'exit-code'. > > Even if the service is in FAILED mode, my guests are started correctly.Is it possible that you have configured some (all?) of your guests to be autostarted? That would explain why the guests are running after reboot even though libvirt-guests failed to connect to libvirtd. Even then, libvirt-guests is written in such a way that the fact a guest has already been started should not result in failure. I've checked, and it looks to me like the socket is created very early in the libvirtd startup process, way before attempting to start autostarted guests, so I don't see how that could cause issues. Two things stand out from the log: 1) Resuming guest : error: failed to connect to the hypervisor That progress message is supposed to contain the guest name right after "guest", but you have an empty string instead. That's consistent with the script being unable to connect to libvirtd 2) the script can't connect to libvirtd which is extremely weird because, right before looping throught the guests and starting them, a connectivity check is performed to make sure libvirtd can be reached Can you please provide a full, verbose libvirtd log[1]? Can you also add[2] set -x to /usr/libexec/libvirt-guests.sh and then provide the output of # systemctl -b 0 -u libvirt-guests.service after rebooting the host?> If I do a systemctl restart libvirt-guests.service, everything is good : > libvirt-guests.service - Suspend Active Libvirt Guests > Loaded: loaded (/usr/lib/systemd/system/libvirt-guests.service; > enabled; vendor preset: disabled) > Active: active (exited) since Thu 2016-09-15 17:42:01 CEST; 9min ago > Docs: man:libvirtd(8) > http://libvirt.org > Main PID: 605 (code=exited, status=0/SUCCESS) > > Sep 15 17:42:01 systemd[1]: Starting Suspend Active Libvirt Guests... > Sep 15 17:42:01 systemd[1]: Started Suspend Active Libvirt Guests. > > Now if I do a shutdown -r now again, I can see both in my libvirt logs + > in my guests log that an ACPI commands has been sent from libvirt and my > guests shutdown gently, so everything is ok. > > When I start again, and try to do a systemctl status > libvirt-guests.service, I got the same FAILED status as before.... > > I made a research on my logs : > > Sep 15 17:39:29 libvirt-guests.sh[1164]: Shutting down guests on default > URI... > Sep 15 17:39:29 libvirt-guests.sh[1164]: Starting shutdown on guest: ape > Sep 15 17:39:30 libvirt-guests.sh[1164]: Waiting for guest ape to shut > down, 30 seconds leftLooks like you changed at least a couple of settings here, eg. the timeout. Can you please provide the output of $ grep -Ev '^(#|$)' /etc/sysconfig/libvirt-guests so we can be sure we're on the same page?> Sep 15 17:39:35 libvirt-guests.sh[1164]: Waiting for guest ape to shut > down, 25 seconds left > Sep 15 17:39:36 libvirtd[326]: internal error: End of file from monitor > Sep 15 17:39:36 libvirtd[326]: Cannot write data: Broken pipe[...]> We see the same error as systemctl status plus some error like " > internal error: End of file from monitor" and others type. > Don't know if I have anything to take a look there ?That might be a red herring, I think it's just the QEMU process being killed during shutdown and thus ending communication with libvirtd abruptly.> Finally, I also notice some very strange thing: > If I stop the service with systemctl stop libvirt-guests.service it > tooks a while to execute the stop command and fails. > > Sep 15 18:00:44 libvirt-guests.sh[978]: Starting shutdown on guest: ape > Sep 15 18:00:45 libvirt-guests.sh[978]: Waiting for guest ape to shut > down, 30 seconds left > Sep 15 18:00:50 libvirt-guests.sh[978]: Waiting for guest ape to shut > down, 25 seconds left > Sep 15 18:00:55 libvirt-guests.sh[978]: Waiting for guest ape to shut > down, 20 seconds left > Sep 15 18:01:00 libvirt-guests.sh[978]: Waiting for guest ape to shut > down, 15 seconds left > Sep 15 18:01:06 libvirt-guests.sh[978]: Waiting for guest ape to shut > down, 10 seconds left > Sep 15 18:01:11 libvirt-guests.sh[978]: Waiting for guest ape to shut > down, 5 seconds left > Sep 15 18:01:15 libvirt-guests.sh[978]: Shutdown of guest ape failed to > complete in time. > > So my guests are not able to shutdown properly but when I shutdown -r > now my host I can see that my guests have been shutdown gently, I guess > there is something related on my issue....It looks like at least one of the guests is taking awfully long to shut down. What if you bump up the timeout? Does it manage to shut down before reaching it? [1] Follow instructions at http://wiki.libvirt.org/page/DebugLogs [2] When editing the file, add the new instruction *right after* the first line. Of course you should make a backup copy of the file before tweaking it -- Andrea Bolognani / Red Hat / Virtualization
Reasonably Related Threads
- Re: systemctl libvirt-guests.service fails to start during host boot
- Re: Ubuntu 16.04 libvirt-guests.sh [6917] - running guests under URI address default: no running guests
- Re: Ubuntu 16.04 libvirt-guests.sh [6917] - running guests under URI address default: no running guests
- Re: Ubuntu 16.04 libvirt-guests.sh [6917] - running guests under URI address default: no running guests
- Xen died - Fedora upgrade from 21 to 26