I was sort of hoping that it was something simple like setting the
"do_the_right_thing" flag.

libvirtd kicks out:

2014-10-31 11:58:57.111+0000: 8741: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
2014-10-31 11:59:29.379+0000: 8840: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
2014-10-31 12:02:03.419+0000: 14712: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
2014-10-31 12:02:20.547+0000: 14712: error : virNetlinkEventCallback:343 : nl_recv returned with error: No buffer space available
2014-10-31 12:02:21.873+0000: 17428: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
2014-10-31 12:03:06.721+0000: 17428: error : virNetlinkEventCallback:343 : nl_recv returned with error: No buffer space available

(I deleted the other errors caused by trying to load drivers that don't
exist.)

I rebooted 3 systems, mirantis_[457].

/var/log/libxl/* kicks out:

mirantis_4.log:libxl: error: libxl_dm.c:1311:libxl__destroy_device_model: Device Model already exited
mirantis_5.log:libxl: error: libxl_dm.c:1311:libxl__destroy_device_model: Device Model already exited

These are more interesting.

I wonder if libxl has a race condition.

On 10/31/2014 02:58 AM, Martin Kletzander wrote:
> On Thu, Oct 30, 2014 at 01:00:04PM -0400, Alvin Starr wrote:
>>
>> If I reboot a single VM through libvirt/libxl the system reboots
>> normally.
>> If I have several VMs reboot at the same time then the systems go into
>> a paused state and do not reboot.
>> I then have to kill them via xl and restart them.
>>
>
> Do logs [1] uncover something?
>
> Martin
>
> [1] http://libvirt.org/logging.html

--
Alvin Starr                   ||   voice: (905)513-7688
Netvel Inc.                   ||   Cell:  (416)806-0133
alvin@netvel.net              ||
On Fri, Oct 31, 2014 at 08:34:48AM -0400, Alvin Starr wrote:
>I was sort of hoping that it was something simple like setting the
>"do_the_right_thing" flag.
>
>libvirtd kicks out:
>
>2014-10-31 11:58:57.111+0000: 8741: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
>2014-10-31 11:59:29.379+0000: 8840: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
>2014-10-31 12:02:03.419+0000: 14712: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
>2014-10-31 12:02:20.547+0000: 14712: error : virNetlinkEventCallback:343 : nl_recv returned with error: No buffer space available
>2014-10-31 12:02:21.873+0000: 17428: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
>2014-10-31 12:03:06.721+0000: 17428: error : virNetlinkEventCallback:343 : nl_recv returned with error: No buffer space available
>
>(I deleted the other errors caused by trying to load drivers that don't
>exist.)
>
>I rebooted 3 systems, mirantis_[457].
>
>/var/log/libxl/* kicks out:
>
>mirantis_4.log:libxl: error: libxl_dm.c:1311:libxl__destroy_device_model: Device Model already exited
>mirantis_5.log:libxl: error: libxl_dm.c:1311:libxl__destroy_device_model: Device Model already exited
>
>These are more interesting.
>
>I wonder if libxl has a race condition.
>

I'm completely unaware of how libxl works, so I can only guess. But this
looks like an error in libxl. Maybe someone else has an idea?

Martin

>On 10/31/2014 02:58 AM, Martin Kletzander wrote:
>> On Thu, Oct 30, 2014 at 01:00:04PM -0400, Alvin Starr wrote:
>>>
>>> If I reboot a single VM through libvirt/libxl the system reboots
>>> normally.
>>> If I have several VMs reboot at the same time then the systems go into
>>> a paused state and do not reboot.
>>> I then have to kill them via xl and restart them.
>>>
>>
>> Do logs [1] uncover something?
>>
>> Martin
>>
>> [1] http://libvirt.org/logging.html
>
>--
>Alvin Starr                   ||   voice: (905)513-7688
>Netvel Inc.                   ||   Cell:  (416)806-0133
>alvin@netvel.net              ||
>
I am not sure if this is a libvirt or libxl problem.

It looks as if each running VM has an associated event thread, and this
thread calls libxl_destroy to clean up the rebooting processes.
Possibly these threads should take a lock to ensure synchronization and
allow only one reboot or termination at a time.

I will try to add some diagnostics and run a few more tests.

On 10/31/2014 09:28 AM, Martin Kletzander wrote:
> On Fri, Oct 31, 2014 at 08:34:48AM -0400, Alvin Starr wrote:
>> I was sort of hoping that it was something simple like setting the
>> "do_the_right_thing" flag.
>>
>> libvirtd kicks out:
>>
>> 2014-10-31 11:58:57.111+0000: 8741: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
>> 2014-10-31 11:59:29.379+0000: 8840: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
>> 2014-10-31 12:02:03.419+0000: 14712: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
>> 2014-10-31 12:02:20.547+0000: 14712: error : virNetlinkEventCallback:343 : nl_recv returned with error: No buffer space available
>> 2014-10-31 12:02:21.873+0000: 17428: error : virRegisterNetworkDriver:549 : driver in virRegisterNetworkDriver must not be NULL
>> 2014-10-31 12:03:06.721+0000: 17428: error : virNetlinkEventCallback:343 : nl_recv returned with error: No buffer space available
>>
>> (I deleted the other errors caused by trying to load drivers that don't
>> exist.)
>>
>> I rebooted 3 systems, mirantis_[457].
>>
>> /var/log/libxl/* kicks out:
>>
>> mirantis_4.log:libxl: error: libxl_dm.c:1311:libxl__destroy_device_model: Device Model already exited
>> mirantis_5.log:libxl: error: libxl_dm.c:1311:libxl__destroy_device_model: Device Model already exited
>>
>> These are more interesting.
>>
>> I wonder if libxl has a race condition.
>>
>
> I'm completely unaware of how libxl works, so I can only guess. But this
> looks like an error in libxl. Maybe someone else has an idea?
>
> Martin
>
>> On 10/31/2014 02:58 AM, Martin Kletzander wrote:
>>> On Thu, Oct 30, 2014 at 01:00:04PM -0400, Alvin Starr wrote:
>>>>
>>>> If I reboot a single VM through libvirt/libxl the system reboots
>>>> normally.
>>>> If I have several VMs reboot at the same time then the systems go
>>>> into a paused state and do not reboot.
>>>> I then have to kill them via xl and restart them.
>>>>
>>>
>>> Do logs [1] uncover something?
>>>
>>> Martin
>>>
>>> [1] http://libvirt.org/logging.html
>>
>> --
>> Alvin Starr                   ||   voice: (905)513-7688
>> Netvel Inc.                   ||   Cell:  (416)806-0133
>> alvin@netvel.net              ||
>>

--
Alvin Starr                   ||   voice: (905)513-7688
Netvel Inc.                   ||   Cell:  (416)806-0133
alvin@netvel.net              ||
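
To make the locking idea in the last message concrete, here is a toy model in Python. It is only a sketch of the general pattern (one event thread per VM, one shared lock around the teardown step) and is not libvirt or libxl code. Every name in it (TeardownTracker, destroy_domain, run_reboots) is made up for illustration, and whether the real driver needs or already has such a lock is not established anywhere in this thread.

```python
import threading
import time


class TeardownTracker:
    """Counts how many simulated teardowns are in flight at once."""

    def __init__(self):
        self._count_lock = threading.Lock()
        self.active = 0
        self.max_active = 0

    def destroy_domain(self, name):
        # Hypothetical stand-in for the per-VM cleanup; no real libxl call.
        with self._count_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # pretend the cleanup takes a moment
        with self._count_lock:
            self.active -= 1


def run_reboots(names, serialize):
    """Start one event thread per VM name; return the peak overlap seen."""
    tracker = TeardownTracker()
    teardown_lock = threading.Lock()  # the single shared lock being proposed

    def event_thread(name):
        if serialize:
            # Only one thread at a time may run the teardown step.
            with teardown_lock:
                tracker.destroy_domain(name)
        else:
            tracker.destroy_domain(name)

    threads = [threading.Thread(target=event_thread, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return tracker.max_active
```

With serialize=True the peak overlap is always 1, because the shared lock admits one teardown at a time. With serialize=False the teardowns may overlap, and how many do depends on thread scheduling. A global lock like this trades concurrency for simplicity; a per-domain lock would be the finer-grained alternative if the teardowns turn out to be independent.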