thr3ads.net - CentOS virt - [CentOS-virt] Time [Jan 2013]

James B. Byrne

2013-Jan-02 14:41 UTC

[CentOS-virt] Time

On Wed, January 2, 2013 03:53, Robert Dinse wrote:
>      Friday, I moved our servers to a new co-lo facility and ran into
> an interesting problem with virtual machines.
>
>      I did an orderly shutdown of the CentOS 6.3 host, and it in
> turn suspends all the guests.  It took about an hour and a half
> to move and fire up the host.
>
>      The guests, being suspended, were then an hour and a half
> behind and it seems ntpd does not want to correct more than 1000
> seconds of error so it would not automatically adjust the clocks.
>
>      I tried the -g argument which is supposed to override the
> 1000 second limit but it did not.  I ended up having to manually
> set the clocks close enough for ntpd to correct.
>
>      Since there is no hardware clock for the virtual machines
> to use when they boot, it seems that shutdown and reboot of the
> virtual machines probably would not have avoided this.
>
>      Any suggestions for addressing this particular scenario other
> than having to manually set a bunch of clocks?
>

I ran into this situation several times whilst testing KVM and the
lessons I learned from the experiences can be summarized as:

1.  Never allow the kvm hypervisor to handle guests during a host
shutdown.  Use 'virsh shutdown' on each of the guests first and then
shut down the host.  Use autostart to restart guests on a host's
reboot.  If automation is required, write a script that processes
'virsh list' to feed active domains to 'virsh shutdown' and link it
to /etc/rc0.d/K10<whatever>.

2.  In the situation where a kvm guest pause and restore sequence
leads to an excessive disconnect between guest time and wall time, use
ntpd -q to hard set the time.  From the guest's point of view you are
always going ahead in time in the case of a pause and resume, so this
is not likely to ever cause a problem. But, having written that down,
it probably will at some point.

3.  Run ntpd on the host system and have its guests configured to only
use that time server source.

4.  On each guest have a cron job that checks for ntpd at regular
intervals, reports failures, and restarts the time service as
necessary. We use:
  JOBNAME="Check ntpd status and restart if required" ; \
    ntpstat > /dev/null && \
    if [[ $? -gt 0 ]]; then /sbin/service ntpd start; fi

-- 
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3

SilverTip257

2013-Jan-02 17:51 UTC

[CentOS-virt] Time

On Wed, Jan 2, 2013 at 9:41 AM, James B. Byrne
<byrnejb at harte-lyne.ca> wrote:
>
> I ran into this situation several times whilst testing KVM and the
> lessons I learned from the experiences can be summarized as:
>
> 1.  Never allow the kvm hypervisor to handle guests during a host
> shutdown.  Use 'virsh shutdown' on each of the guests first and then
> shut down the host.  Use autostart to restart guests on a host's
> reboot.  Write a script to process 'virsh list' to feed active domains
> to 'virsh shutdown' if automation is required and link that to
> /etc/rc0.d/K10<whatever>.
>
@James:  Can you specifically cite why you manually power down each node?
Have you tried tweaking your libvirt settings in the config file I noted in
my earlier response to Robert?

Oh, and another note: you can set libvirt so that it _only_ starts the
machines that were running when the host machine was issued a shutdown.
There are advantages and disadvantages to both approaches (automatic
shutdown by the host and doing it manually), as there are with anything.
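
As a rough illustration of that kind of setting: on CentOS 6 the
libvirt-guests init script reads /etc/sysconfig/libvirt-guests. Whether
that is the config file referred to above is an assumption, and the
timeout value below is only an example.

```shell
# /etc/sysconfig/libvirt-guests (assumed location on CentOS 6)

# On host boot, start the guests that were running at host shutdown.
ON_BOOT=start

# On host shutdown, ask guests to shut down cleanly.
# Use "suspend" instead to save guest state to disk, which is the
# behaviour that leaves guest clocks behind after a long outage.
ON_SHUTDOWN=shutdown

# Seconds to wait for guests to shut down (example value).
SHUTDOWN_TIMEOUT=300
```

The file is plain shell variable assignments, so it can be checked with
'bash -n' before the libvirt-guests service is restarted.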

> 2.  In the situation where a kvm guest pause and restore sequence
> leads to an excessive disconnect between guest time and wall time use
> ntpd -q to hard set the time.  From the guest's point of view you are
> always going ahead in time in the case of a pause and resume so this
> is not likely to ever cause a problem. But, having written that down,
> it probably will at some point.
>
> 3.  Run ntpd on the host system and have its guests configured to only
> use that time server source.
>
Set up a central NTP server and have your hosts (and not just VMs) connect
to it.  It could be the VM host, but doesn't need to be.
Distribute the load to your NTP server and off of the public NTP pool by
running an NTP server for your servers to poll [0] ... it's a good practice
and everybody is happy.

[0] http://support.ntp.org/bin/view/Support/DesigningYourNTPNetwork#Section_5.7.

> 4.  On each guest have a cron job that checks for ntpd at regular
> intervals which reports failures and restarts the time service as
> necessary. We use:
>   JOBNAME="Check ntpd status and restart if required" ; \
>     ntpstat > /dev/null && \
>     if [[ $? -gt 0 ]]; then /sbin/service ntpd start; fi
>
Why not configure the ntpd daemon and stick with that?
It does update on its own [1].
And ntpstat prints out the interval, which matches the one mentioned at [1].
I don't believe the ntpstat script/job is necessary (I've never had to do
more than set ntpd to run after configuring the servers it should poll).

~]$ ntpstat
synchronised to NTP server (x.x.x.x) at stratum 3
   time correct to within 67 ms
   polling server every 1024 s

[1] http://www.ntp.org/ntpfaq/NTP-s-algo.htm#Q-ALGO-POLL-BEST

-- 
---~~.~~---
Mike
//  SilverTip257  //

Robert Dinse

2013-Jan-02 19:22 UTC

[CentOS-virt] Time

I thank everyone for their input.  I prefer to have the guests suspend
rather than shut down because many of my customers start things up manually and
get annoyed when they come back to find them not running.  Most of the time
it is for a simple reboot of the host to make a new kernel or some other update
active.  This particular time, though, I had to move all the equipment to a new
co-location facility because the old provider had become far too greedy.

     So I guess rdate -s is probably the best solution.
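
A minimal sketch of that one-shot correction, to be run on a guest after
a long suspend. It assumes the chosen server offers the RFC 868 time
service that rdate needs; the host name and the function name are
hypothetical, and the command paths are variables only so they can be
swapped out.

```shell
# Hypothetical time server; replace with one that answers rdate.
TIME_SERVER=${TIME_SERVER:-timehost.example.com}
SERVICE=${SERVICE:-/sbin/service}
RDATE=${RDATE:-rdate}

step_clock() {
    local rc=0
    # Stop ntpd so it is not running while the clock is stepped.
    "$SERVICE" ntpd stop
    # -s sets the system clock to the time returned by the server.
    "$RDATE" -s "$TIME_SERVER" || rc=$?
    # Start ntpd again whether or not the step succeeded.
    "$SERVICE" ntpd start
    return "$rc"
}

# Usage (as root on the guest):
#   step_clock || echo "clock step failed" >&2
```

With ntpd stopped, 'ntpd -gq' could stand in for the rdate line as the
one-shot step; that substitution is a suggestion, not something tested
in this thread.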

-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
 Eskimo North Linux Friendly Internet Access, Shell Accounts, and Hosting.
   Knowledgeable human assistance, not telephone trees or script readers.
 See our web site: http://www.eskimo.com/ (206) 812-0051 or (800) 246-6874.

James B. Byrne

2013-Jan-03 00:27 UTC

[CentOS-virt] Time

On Wed, January 2, 2013 12:51, SilverTip257 wrote:
> On Wed, Jan 2, 2013 at 9:41 AM, James B. Byrne
> <byrnejb at harte-lyne.ca> wrote:
>
>>
>> I ran into this situation several times whilst testing KVM and the
>> lessons I learned from the experiences can be summarized as:
>>
>> 1.  Never allow the kvm hypervisor to handle guests during a host
>> shutdown.  Use 'virsh shutdown' on each of the guests first and then
>> shut down the host.  Use autostart to restart guests on a host's
>> reboot.  Write a script to process 'virsh list' to feed active
>> domains to 'virsh shutdown' if automation is required and link that
>> to /etc/rc0.d/K10<whatever>.
>>
>>
> @James:  Can you specifically cite why you manually power down each
> node? Have you tried tweaking your libvirt settings in the config
> file I noted in my earlier response to Robert?
Two reasons.  First, I am minimally familiar with kvm. The niceties of
its options are beyond my ken for the nonce.  Second, libvirt
does not always work.  I have had guests refuse to either suspend or
shut down on an automatic request to do so.  When shutdown is done
manually one discovers right away that there is a problem and which
guest is causing it.
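
A minimal sketch of that manual approach, scripted so that it still
names any guest that refuses to shut down. It assumes the usual two
header lines in 'virsh list' output; the function names and the TIMEOUT
and POLL values are illustrative only.

```shell
VIRSH=${VIRSH:-virsh}
TIMEOUT=${TIMEOUT:-120}   # seconds to wait per guest (example value)
POLL=${POLL:-5}           # seconds between state checks (example value)

running_guests() {
    # Skip the two header lines of 'virsh list'; column 2 is the name.
    "$VIRSH" list | awk 'NR > 2 && NF { print $2 }'
}

shutdown_guests() {
    local failed=0 dom waited
    for dom in $(running_guests); do
        "$VIRSH" shutdown "$dom" >/dev/null
        waited=0
        # 'virsh domstate' prints "shut off" once the guest has stopped.
        while [ "$("$VIRSH" domstate "$dom")" != "shut off" ]; do
            if [ "$waited" -ge "$TIMEOUT" ]; then
                echo "guest $dom did not shut down within ${TIMEOUT}s" >&2
                failed=1
                break
            fi
            sleep "$POLL"
            waited=$((waited + POLL))
        done
    done
    # Non-zero when at least one guest is still running.
    return "$failed"
}

# Usage, before shutting down the host:
#   shutdown_guests || echo "fix the guests listed above first" >&2
```

Handling guests one at a time is slower than a parallel shutdown, but a
stuck guest is reported by name instead of silently delaying the host.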
> Set up a central NTP server and have your hosts (and not just VMs)
> connect to it.  It could be the VM host, but doesn't need to be.
> Distribute the load to your NTP server and off of the public NTP pool
> by running an NTP server for your servers to poll [0] ... it's a good
> practice and everybody is happy.
>
I do that as well.  However, I run one on each host just to serve its
own guests and configure the host to run off our central ntp server.
>
>
>> 4.  On each guest have a cron job that checks for ntpd at regular
>> intervals which reports failures and restarts the time service as
>> necessary. We use:
>>   JOBNAME="Check ntpd status and restart if required" ; \
>>     ntpstat > /dev/null && \
>>     if [[ $? -gt 0 ]]; then /sbin/service ntpd start; fi
>>
>>
> Why not configure the ntpd daemon and stick with that?
> It does update on its own [1]. And ntpstat prints out the interval,
> which matches the one mentioned at [1].
> I don't believe the ntpstat script/job is necessary (I've never had
> to do more than set ntpd to run after configuring the servers it
> should poll).
>
You misunderstand the purpose of the job.  ntpstat checks to see if
the daemon is actually running.  If it is not then ntpstat returns a
non-zero exit code. If the ntpstat exit code is not zero then the
service script is invoked to restart it.  Additionally, ntpstat writes
out to stderr that it could not find the daemon which gets emailed to
support. I probably should have used [[ ! $? -eq 0 ]] but what I have
written does work.

We found ntpd just stopped on some guests on occasion without any
visible trace of a cause.  It was not frequent, but when it did happen it
was a nuisance to detect before clock drift on the guest caused some
failure or other.  This job detects these occurrences and self
corrects.
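
A hedged sketch of such a watchdog as a small function. One caveat worth
checking in the one-liner quoted above: with '&&' the 'if' is only
reached after ntpstat has succeeded, so '$?' is 0 at that point and the
restart branch would not fire. The version below captures the exit code
first. Restarting only on exit code 2 (ntpd not contactable) is a design
choice of this sketch, as are the function and variable names.

```shell
NTPSTAT=${NTPSTAT:-ntpstat}
SERVICE=${SERVICE:-/sbin/service}

check_ntpd() {
    local rc=0
    # ntpstat exit codes: 0 synchronised, 1 not synchronised,
    # 2 state indeterminate (for example ntpd not contactable).
    # stderr is left alone so cron can mail ntpstat's own message.
    "$NTPSTAT" >/dev/null || rc=$?
    if [ "$rc" -eq 2 ]; then
        echo "ntpd not responding (ntpstat exit $rc); restarting" >&2
        "$SERVICE" ntpd restart
    fi
    return 0
}

# Usage from a cron-run script:
#   check_ntpd
```

Exit code 1 is ignored here because ntpd normally reports itself as
unsynchronised for a while after starting, and restarting it in that
state would keep resetting the process.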

These are all CentOS-6.3 hosts and guests.
