On Tue, Jun 06, 2023 at 04:56:38PM -0400, Jerry Buburuz wrote:
> Recently both virsh stopped talking to the libvirtd. Both stopped within a
> few days of each other.

I've run into exactly the same problem.

I'm running libvirt (libvirt-9.0.0-3.fc38.x86_64) on Fedora 38. On
Fedora, libvirtd is configured by default to use socket activation and
is run with the `--timeout 120` option.

After some recent upgrades, I'm seeing the exact same symptoms that
Jerry described -- virsh commands simply get stuck at the same call to
`poll()`.

It looks like libvirtd is either crashing or failing to start, because
when virsh is in this state the `libvirtd` process isn't running. This
makes it *sound* like a systemd problem, but I'm not seeing errors
anywhere -- either from libvirtd or from systemd.

I've worked around the problem locally by re-configuring libvirtd to
run persistently rather than using socket activation:

    systemctl disable --now libvirtd{,-ro,-admin}.socket

    cat > /etc/systemd/system/libvirtd.service.d/override.conf <<EOF
    [Service]
    EnvironmentFile
    EOF

    systemctl restart libvirtd

Package versions in case this helps correlate something:

- libvirt-9.0.0-3.fc38.x86_64
- systemd-253.5-1.fc38.x86_64
- kernel-6.3.6-200.fc38.x86_64

Libvirt uri: qemu:///system

-- 
Lars Kellogg-Stedman <lars at redhat.com> | larsks @ {irc,twitter,github}
http://blog.oddbit.com/                 | N1LKS
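For anyone trying to reproduce the hang described above without tying up a terminal, here is a minimal sketch that wraps the virsh call in a time limit. The function name `check_virsh` and the `VIRSH` override variable are made up for illustration; only `timeout` (coreutils) and `virsh -c URI list --all` are real commands. The 10-second default is an arbitrary assumption, not a value libvirt recommends.

```shell
#!/bin/sh
# check_virsh [URI] [SECONDS]  (hypothetical helper, not part of libvirt)
# Runs "virsh list --all" against URI and reports whether it answered,
# failed outright, or was still blocked when the time limit expired.
check_virsh() {
    uri=${1:-qemu:///system}
    limit=${2:-10}
    # VIRSH is an illustrative override so the client binary can be swapped.
    timeout "$limit" "${VIRSH:-virsh}" -c "$uri" list --all >/dev/null 2>&1
    rc=$?
    case $rc in
        0)   echo "virsh responded" ;;
        # timeout(1) exits with 124 when it had to stop the command.
        124) echo "virsh timed out after ${limit}s" ;;
        *)   echo "virsh failed with exit status $rc" ;;
    esac
    return $rc
}
```

Calling `check_virsh qemu:///system 10` right after a reboot and again a few minutes later may help show whether the hang only appears once the daemon has exited on its idle timeout.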
Thank you Lars.

My next step is to try TCP rather than the unix socket.

Just to clarify:

* I am using Ubuntu 22.04 LTS.
* systemd shows libvirtd running with no errors, and it creates its unix
  socket at /run/libvirt/libvirt-sock.
* None of the services are failing.
* I have turned on every debugging feature I could find; there are no
  errors from virsh or from the libvirtd services.
* I recently attached gdb to the libvirtd and virsh processes and did not
  see any errors.

I also set up an identical VM with 22.04 and all patches, and compared
permissions, open files (lsof), logs, etc. On the new VM, virsh connects
with no problem. My two existing hypervisors are still dead.

The only difference between my new test VM and the dead hypervisors is
that the problem hypervisors use a mounted cephfs to store virtual
machines. I have not tried to unmount the cephfs yet. Maybe it's causing
delays in something? virsh and /etc/libvirtd/ are local to the
hypervisor; I only use the cephfs to store the images.

This problem started around the same time my cephfs storage had issues.

thanks
jerry
Just found my issue. After I removed the cephfs mounts it worked! I will
debug ceph. I had assumed that because I could touch files on the mounted
cephfs, it was working.

Now virsh list works!

thanks
jerry
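Since the cephfs mounts turned out to be the cause, a quick responsiveness check on each mount may be a useful first step before unmounting anything. The sketch below is only an illustration: `probe_path`, `mounts_of_type` and `report_mounts` are hypothetical helper names, the 5-second limit is arbitrary, and it assumes kernel-client CephFS mounts are listed with filesystem type `ceph` in /proc/mounts (ceph-fuse mounts use a different type string). A passing probe does not prove the mount is healthy (touching files worked in the case above too), and a process stuck in uninterruptible I/O may not be stoppable by `timeout` at all.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of libvirt or ceph.

# probe_path PATH [SECONDS]
# Succeeds only if a directory listing of PATH completes within the limit.
probe_path() {
    path=$1
    limit=${2:-5}
    timeout "$limit" ls -A "$path" >/dev/null 2>&1
}

# mounts_of_type FSTYPE
# Prints the mount point (second field) of every /proc/mounts entry whose
# filesystem type (third field) matches. Mount points containing spaces
# appear escaped in /proc/mounts and are not handled here.
mounts_of_type() {
    awk -v t="$1" '$3 == t { print $2 }' /proc/mounts
}

# report_mounts FSTYPE [SECONDS]
# Probes every mount of the given type and prints one line per mount.
report_mounts() {
    for m in $(mounts_of_type "$1"); do
        if probe_path "$m" "${2:-5}"; then
            echo "ok   $m"
        else
            echo "FAIL $m (listing failed or took longer than ${2:-5}s)"
        fi
    done
}
```

Running `report_mounts ceph` on a hypervisor would list each matching mount with its result; on a machine with no such mounts it prints nothing.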