We encountered a problem with KVM virtual host restore and NTP. Specifically, our VM test host was shut down by an extended power outage, and when power returned all of the restored guests were immediately shut down by ntp because the time differential between the restored systems and the ntpd sync servers exceeded the panic threshold.

This is not an acceptable situation, so in the absence of something more elegant we are contemplating shutting down ntpd on virtual hosts and scheduling a regular ntpd -q from cron instead.

Is there an alternative to this? If not, what would be a recommended scheduling interval?

-- 
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3
On Mon, May 28, 2012 at 08:41:30AM -0400, James B. Byrne wrote:
> We encountered a problem with respect to KVM virtual host restore and
> NTP. Specifically, our VM test host was shutdown by an extended power
> outage and when power returned all of the restored guests were
> immediately shutdown by ntp because the time differential between the
> restored systems and that of the ntpd sync servers exceeded the panic
> threshold.

Umm, ntp won't shut down your guests unless you've done something non-standard. NTP inside the guest will abort because of the time difference, but the guest will keep running... just with the wrong time.

See the "-g" option:

    -g  Normally, ntpd exits with a message to the system log if the
        offset exceeds the panic threshold, which is 1000 s by default.
        This option allows the time to be set to any value without
        restriction; however, this can happen only once. If the
        threshold is exceeded after that, ntpd will exit with a message
        to the system log. This option can be used with the -q and -x
        options. See the tinker command for other options.

-- 
rgds
Stephen
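To make the man page wording above concrete, here is a toy model of the rule it describes. It is only a sketch of the quoted text, not ntpd's actual code; the function name and its parameters are made up for illustration.

```python
PANIC_THRESHOLD_S = 1000.0  # default stated in the quoted man page text


def ntpd_reaction(offset_s, g_option=False, already_stepped=False,
                  panic_threshold_s=PANIC_THRESHOLD_S):
    """Return "exit" or "correct" for a measured clock offset in seconds.

    Toy model only: it encodes the man page wording, not ntpd's source.
    """
    if abs(offset_s) <= panic_threshold_s:
        return "correct"
    # Offset exceeds the panic threshold: -g permits exactly one large correction.
    if g_option and not already_stepped:
        return "correct"
    return "exit"
```

For a guest that comes back three hours out (10800 s), the model says ntpd exits without -g, corrects the clock once with -g, and exits again if the threshold is exceeded a second time. In every case only ntpd stops; the guest itself keeps running.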
On Mon, May 28, 2012 at 5:41 AM, James B. Byrne <byrnejb at harte-lyne.ca> wrote:
> We encountered a problem with respect to KVM virtual host restore and
> NTP. Specifically, our VM test host was shutdown by an extended power
> outage and when power returned all of the restored guests were
> immediately shutdown by ntp because the time differential between the
> restored systems and that of the ntpd sync servers exceeded the panic
> threshold.
>
> This is not an acceptable situation so in the absence of something
> more elegant we are contemplating shutting down ntpd on virtual hosts
> and scheduling regular ntpd -q from cron instead.
>
> Is there an alternative to this? If not then what would be a
> recommended scheduling interval?

The issue has been reported here:

http://bugs.centos.org/view.php?id=5726

You might want to try the workaround in note 15092, that is, to add 'tinker panic 0' to the *top* of the /etc/ntp.conf file.

Akemi
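For anyone who wants to apply that workaround to several guests, a minimal sketch follows. It works on the text of the config file only; the helper name is hypothetical, and reading and writing /etc/ntp.conf (and restarting ntpd afterwards) is left to the caller.

```python
TINKER_LINE = "tinker panic 0"


def with_tinker_panic_first(conf_text):
    """Return conf_text with 'tinker panic 0' as its first line.

    The text is returned unchanged if the first line already is that
    setting. Other lines are left alone.
    """
    lines = conf_text.splitlines()
    if lines and lines[0].split() == ["tinker", "panic", "0"]:
        return conf_text  # already in place, nothing to do
    return "\n".join([TINKER_LINE] + lines) + "\n"
```

The function is idempotent, so running it twice does not add a second line. It does not look for a different 'tinker panic' value further down the file, so check for one by hand before relying on it.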
On Mon, May 28, 2012 08:50, Reindl Harald wrote:
>
> Am 28.05.2012 14:41, schrieb James B. Byrne:
>> when power returned all of the restored guests were immediately
>> shutdown by ntp because the time differential between the
>> restored systems and that of the ntpd sync servers exceeded
>> the panic threshold.
>
> how can ntpd shutdown a guest?

I have no idea. Perhaps I misunderstood what the ntpd man page referred to as a panic.

If it is not ntpd then I still need to discover some way of ensuring that all the KVM guests that were active at the time of a power failure automatically come back on line when the KVM host system starts up. I cannot find any reference to how this is done.

Are there any recommended solutions? These systems are on UPS already, but the power failure duration exceeded the endurance of the UPS.
On Mon, May 28, 2012 10:10, Bob Hoffman wrote:
> On 5/28/2012 9:59 AM, James B. Byrne wrote:
>> On Mon, May 28, 2012 08:50, Reindl Harald wrote:
>>>
>>> Am 28.05.2012 14:41, schrieb James B. Byrne:
>>>> when power returned all of the restored guests were immediately
>>>> shutdown by ntp because the time differential between the
>>>> restored systems and that of the ntpd sync servers exceeded
>>>> the panic threshold.
>>> how can ntpd shutdown a guest?
>> I have no idea. Perhaps I misunderstood what the ntpd man page
>> referred to as a panic.
>>
>> If it is not ntpd then I still need to discover some way of ensuring
>> that all the KVM guests that were active at the time of a power
>> failure automatically come back on line when the KVM host system
>> starts up. I cannot find any reference to how this is done.
>>
>> Are there any recommended solutions? These systems are on UPS
>> already but the power failure duration exceeded the endurance of
>> the UPS.
>>
> I know when ntp changes the time drastically (like ntpdate) my vsftpd
> just commits suicide and dies..
> I imagine something like that is going on with the lvm software either
> on the host or the kvm?
>
> I would suggest turning off ntp before long time shut downs...and
> (ugh) manually going through the host and all vms upon turn on and
> ntpdate them, then turn ntp on, then reboot to make it all come
> back on?
>
> perhaps a script that turns off ntp, runs ntpdate on host, then on
> each kvm upon reboot? this sounds rather scary.
>

I cannot find anything in the logs that explains what is happening. The evidence I have indicates that when the KVM host is powered off and restarted, the guests do not restart. This behaviour is at variance with a controlled shutdown, wherein active guests are (usually) restarted when the host reboots. I infer from this observation that system scripts already handle the controlled case more or less correctly.
I suppose that I could just create an init script that reads a custom status file and restarts every domain that it finds therein using virsh. However, if such a beast already exists then I would rather not reinvent the wheel. Is anyone aware of such a script or where one might be found? I am not having much luck with Google this morning.
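A rough sketch of the script described above, in case it helps the discussion. It assumes a status file with one domain name per line; the file format, the function names, and the idea of passing in the command runner are all assumptions made for illustration, and nothing here says how the status file is kept up to date.

```python
import subprocess


def read_domains(status_path):
    """Return domain names listed one per line, skipping blanks and # comments."""
    with open(status_path) as handle:
        return [
            line.strip() for line in handle
            if line.strip() and not line.lstrip().startswith("#")
        ]


def restart_domains(domains, run=subprocess.call):
    """Run 'virsh start <name>' for each domain; return the names that failed."""
    failed = []
    for name in domains:
        # A non-zero exit status from virsh is treated as a failed start.
        if run(["virsh", "start", name]) != 0:
            failed.append(name)
    return failed
```

An init script could call restart_domains(read_domains(path)) once libvirtd is up and log whatever comes back in the failed list. The run parameter exists only so the logic can be tried without touching real guests.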