Steven Ellis
2010-Mar-10 22:28 UTC
[CentOS-virt] Logrotate/cron and major I/O contention with KVM.
Is anyone else having major I/O peaks due to logrotate or other jobs running simultaneously across multiple guests? I have one KVM server running CentOS 5.4 with local disk that is seriously suffering as most of the guests rotate their syslog at the same time.

Looking at the KVM server I'm seeing

    11:00:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
    03:40:01 AM       all      0.07      0.00      2.74      0.93      0.00     96.26
    03:50:01 AM       all      0.07      0.00      1.17      1.18      0.00     97.58
    04:00:01 AM       all      0.08      0.00      1.51      0.82      0.00     97.59
    04:10:02 AM       all      0.53      0.03     15.31     51.61      0.00     32.53
    04:20:01 AM       all      0.28      0.12      4.12     22.21      0.00     73.27
    04:30:01 AM       all      0.07      0.00      0.80      1.21      0.00     97.92
    04:40:01 AM       all      0.07      0.00      2.60      1.81      0.00     95.52
    04:50:01 AM       all      0.08      0.00      0.79      1.44      0.00     97.69

On one of the guests running CentOS 4.6 the impact is so bad I get DMA timeout errors in the syslog, and occasional kernel panics.

    Mar 11 04:05:04 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
    Mar 11 04:05:14 localhost kernel: hda: DMA timeout error
    Mar 11 04:05:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
    Mar 11 04:05:14 localhost kernel:
    Mar 11 04:05:14 localhost kernel: ide: failed opcode was: unknown
    Mar 11 04:05:59 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
    Mar 11 04:06:14 localhost kernel: hda: DMA timeout error
    Mar 11 04:06:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }

One reference I've found is at

  * http://lonesysadmin.net/linux-virtual-machine-tuning-guide/

This suggests avoiding running scheduled jobs simultaneously across guests, and suggests using a random sleep.

Does anyone else have suggestions on reducing the impact of cron/logrotate?

-- 
Steven Ellis - Director of Worldwide Engineering,
Bulletin.Net Inc - http://www.bulletin.net/
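The random-sleep idea from the linked guide could be sketched roughly as follows. This is only an illustration, not what the guide itself prescribes: the function name random_delay and the 1800-second window are made up for the example, and it assumes /dev/urandom and od are available in the guest. It reads from /dev/urandom rather than seeding from the clock, since guests that all start at the same second would otherwise pick the same delay.

```shell
#!/bin/sh
# random_delay MAX
# Print a pseudo-random whole number of seconds in the range 0..MAX-1.
# (Hypothetical helper; MAX should be a plain integer between 1 and 65535.)
random_delay() {
    max=$1
    # Read two bytes from the kernel's random pool as one unsigned 16-bit
    # integer, then strip the padding that od adds around the number.
    raw=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( $raw % $max ))
}

# Possible use inside a cron job wrapper (not executed here):
#   sleep "$(random_delay 1800)" && run-parts /etc/cron.daily
```

With a 1800-second window each guest would start its daily jobs at some point within half an hour of the scheduled time, which spreads the load but does not guarantee that two guests never overlap.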
Benjamin Franz
2010-Mar-10 22:36 UTC
[CentOS-virt] Logrotate/cron and major I/O contention with KVM.
Steven Ellis wrote:
> This suggests avoiding running scheduled jobs simultaneously across
> guests, and suggests using a random sleep.
>
> Does anyone else have suggestions on reducing the impact of
> cron/logrotate.

Set up a syslog server and have all your machines send their logs there instead of keeping them locally on each machine.

-- 
Benjamin Franz
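As a rough sketch of the forwarding side of this suggestion: with the classic syslogd shipped on CentOS 4 and 5, a rule of the form "*.* @host" in /etc/syslog.conf sends every message to a remote host over UDP. The helper name add_remote_syslog_rule and the host loghost.example.com below are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# add_remote_syslog_rule CONF_FILE LOGHOST
# Append a "forward everything" rule to a syslog.conf-style file, unless
# the exact rule is already present. (Hypothetical helper.)
add_remote_syslog_rule() {
    conf=$1
    loghost=$2
    rule="*.* @$loghost"
    # -x: match whole lines only, -F: treat the rule as a fixed string so
    # the '*' and '.' characters are not interpreted as a regex.
    grep -qxF "$rule" "$conf" 2>/dev/null || printf '%s\n' "$rule" >> "$conf"
}

# Possible use on a guest (not executed here), followed by a syslog restart:
#   add_remote_syslog_rule /etc/syslog.conf loghost.example.com
```

Note that this only adds forwarding. To actually take the rotation load off the guests, the local file rules would also need to be removed or reduced, and the receiving syslogd has to be started with remote reception enabled (the -r option for sysklogd).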
Mathew S. McCarrell
2010-Mar-11 19:02 UTC
[CentOS-virt] Logrotate/cron and major I/O contention with KVM.
On Wed, Mar 10, 2010 at 5:28 PM, Steven Ellis <steven.ellis at bulletin.net> wrote:
> Is anyone else having major I/O peaks due to logrotate or other jobs
> running simultaneously across multiple guests. I have one KVM server running
> Centos 5.4 with local disk that is seriously suffering as most of the guests
> rotate their syslog at the same time.
>
> Looking at the KVM server I'm seeing
>
> 11:00:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
> 03:40:01 AM       all      0.07      0.00      2.74      0.93      0.00     96.26
> 03:50:01 AM       all      0.07      0.00      1.17      1.18      0.00     97.58
> 04:00:01 AM       all      0.08      0.00      1.51      0.82      0.00     97.59
> 04:10:02 AM       all      0.53      0.03     15.31     51.61      0.00     32.53
> 04:20:01 AM       all      0.28      0.12      4.12     22.21      0.00     73.27
> 04:30:01 AM       all      0.07      0.00      0.80      1.21      0.00     97.92
> 04:40:01 AM       all      0.07      0.00      2.60      1.81      0.00     95.52
> 04:50:01 AM       all      0.08      0.00      0.79      1.44      0.00     97.69
>
> On one of the guests running Centos 4.6 the impact is so bad I get DMA
> timeout errors in the syslog, and occasional kernel panics.
>
> Mar 11 04:05:04 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
> Mar 11 04:05:14 localhost kernel: hda: DMA timeout error
> Mar 11 04:05:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
> Mar 11 04:05:14 localhost kernel:
> Mar 11 04:05:14 localhost kernel: ide: failed opcode was: unknown
> Mar 11 04:05:59 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
> Mar 11 04:06:14 localhost kernel: hda: DMA timeout error
> Mar 11 04:06:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
>
> One reference I've found is at
> * http://lonesysadmin.net/linux-virtual-machine-tuning-guide/
>
> This suggests avoiding running scheduled jobs simultaneously across guests,
> and suggests using a random sleep.
>
> Does anyone else have suggestions on reducing the impact of cron/logrotate.

I ran into this issue as well on a box running Xen with local storage.

My solution was to modify /etc/crontab to run /etc/cron.weekly at different times for each guest and for the dom0. I modified the entry on each VM to be 10 minutes after the previous one and have not seen any load spikes since then.

Matt

-- 
Mathew S. McCarrell
Clarkson University '10
mccarrms at gmail.com
mccarrms at clarkson.edu
1-518-314-9214
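The staggering described above could be generated with something like the sketch below. It is not Matt's actual setup: the function name staggered_cron_line and the guest index are hypothetical, and the base time of 04:22 on Sunday is an assumption taken from the stock CentOS 5 /etc/crontab entry for cron.weekly. The same helper would work for /etc/cron.daily, which is where logrotate normally runs.

```shell
#!/bin/sh
# staggered_cron_line INDEX STEP_MIN BASE_HOUR BASE_MIN DOW DIR
# Print an /etc/crontab line for run-parts DIR, shifted INDEX * STEP_MIN
# minutes after BASE_HOUR:BASE_MIN. (Hypothetical helper; pass plain
# integers without leading zeros, since "08" would be read as octal.)
staggered_cron_line() {
    index=$1; step=$2; base_hour=$3; base_min=$4; dow=$5; dir=$6
    # Work in minutes since midnight so the hour rolls over correctly.
    total=$(( $base_hour * 60 + $base_min + $index * $step ))
    if [ "$total" -ge 1440 ]; then
        # Rolling past midnight would also change the day field; refuse.
        echo "offset runs past midnight; use a smaller index or step" >&2
        return 1
    fi
    printf '%d %d * * %s root run-parts %s\n' \
        $(( $total % 60 )) $(( $total / 60 )) "$dow" "$dir"
}

# Possible use: print the weekly entry for the guest with index 2.
#   staggered_cron_line 2 10 4 22 0 /etc/cron.weekly
```

Giving the dom0 index 0 and each guest the next index reproduces the "10 minutes after the previous one" pattern, provided each job finishes within its 10-minute slot.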
Ray Van Dolson
2010-Mar-11 19:09 UTC
[CentOS-virt] Logrotate/cron and major I/O contention with KVM.
On Thu, Mar 11, 2010 at 11:02:33AM -0800, Mathew S. McCarrell wrote:
> Is anyone else having major I/O peaks due to logrotate or other jobs
> running simultaneously across multiple guests. I have one KVM server
> running Centos 5.4 with local disk that is seriously suffering as
> most of the guests rotate their syslog at the same time.
>
> Looking at the KVM server I'm seeing
>
> 11:00:01 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
> 03:40:01 AM       all      0.07      0.00      2.74      0.93      0.00     96.26
> 03:50:01 AM       all      0.07      0.00      1.17      1.18      0.00     97.58
> 04:00:01 AM       all      0.08      0.00      1.51      0.82      0.00     97.59
> 04:10:02 AM       all      0.53      0.03     15.31     51.61      0.00     32.53
> 04:20:01 AM       all      0.28      0.12      4.12     22.21      0.00     73.27
> 04:30:01 AM       all      0.07      0.00      0.80      1.21      0.00     97.92
> 04:40:01 AM       all      0.07      0.00      2.60      1.81      0.00     95.52
> 04:50:01 AM       all      0.08      0.00      0.79      1.44      0.00     97.69
>
> On one of the guests running Centos 4.6 the impact is so bad I get
> DMA timeout errors in the syslog, and occasional kernel panics.
>
> Mar 11 04:05:04 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
> Mar 11 04:05:14 localhost kernel: hda: DMA timeout error
> Mar 11 04:05:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
> Mar 11 04:05:14 localhost kernel:
> Mar 11 04:05:14 localhost kernel: ide: failed opcode was: unknown
> Mar 11 04:05:59 localhost kernel: hda: dma_timer_expiry: dma status == 0x21
> Mar 11 04:06:14 localhost kernel: hda: DMA timeout error
> Mar 11 04:06:14 localhost kernel: hda: dma timeout error: status=0x50 { DriveReady SeekComplete }
>
> One reference I've found is at
> * http://lonesysadmin.net/linux-virtual-machine-tuning-guide/
>
> This suggests avoiding running scheduled jobs simultaneously across
> guests, and suggests using a random sleep.

I think this is a pretty good suggestion.

> Does anyone else have suggestions on reducing the impact of
> cron/logrotate.

You might also consider increasing the device timeouts on your block devices at the guest level:

    echo 120 > /sys/block/sda/device/timeout

etc, etc. That or increase the performance of your storage :)

> I ran into this issue as well on a box running Xen with local
> storage.
>
> My solution was to modify /etc/crontab to run /etc/cron.weekly at
> different times for each guest and for the dom0. I modified the
> entry on each VM to be 10 minutes after the previous one and have not
> seen any load spikes since then.
>
> Matt

Ray
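The "etc, etc." in the timeout suggestion above could be covered by a small loop such as the following sketch. The function name set_block_timeouts is hypothetical, and the second argument exists only so the loop can be tried against a scratch directory instead of the real /sys/block. It writes only to timeout files that already exist. As far as I know that attribute is exposed for SCSI-style devices (sd*), so a guest whose disk shows up as hda, as in the original post, may have nothing to set.

```shell
#!/bin/sh
# set_block_timeouts SECONDS [SYSFS_ROOT]
# Write SECONDS into every existing, writable <dev>/device/timeout file
# under SYSFS_ROOT (default: /sys/block). (Hypothetical helper.)
set_block_timeouts() {
    seconds=$1
    sysfs_root=${2:-/sys/block}
    for f in "$sysfs_root"/*/device/timeout; do
        # Skip devices without the attribute; this also covers the case
        # where the glob matched nothing and $f is the literal pattern.
        [ -w "$f" ] || continue
        echo "$seconds" > "$f"
    done
}

# Possible use as root inside a guest (not executed here):
#   set_block_timeouts 120
```

The setting does not survive a reboot, so it would need to be reapplied at boot, for example from /etc/rc.local.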