Nathan Gamber
2010-Nov-05 17:30 UTC
[Xen-users] i/o scheduler deadlocks with loopback devices
This was an email I sent to xen-devel a while ago without getting a response. I'm reposting it here in case someone knows more.

Hello all,

I'm able to consistently reproduce lockups in my domU under heavy I/O, with the following error:

[36841.420662] INFO: task rsyslogd:15014 blocked for more than 120 seconds.
[36841.420843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

The blocked task varies; it can be any of the tasks that happen to be active (kjournald, loop0, etc.).

My setup is:
Xen dom0 version 3.4.2.
domU: Ubuntu 10.04, 2.6.36-rc6 based on Stefano Stabellini's v2.6.36-rc6-urgent-fixes tree.
Paravirtual disks and network interfaces.
Root filesystem on /dev/xvda3, formatted ext3, mounted with default options.
Both dom0 and domU are using the CFQ i/o scheduler.

The xvbd is based on LVM, on top of a local SATA RAID array.

To reproduce this, I can do either of the following:

Set up domU as a primary drbd node, with my drbd volume on top of a local loopback device, and then rsync many files to the volume, delete them, and repeat until the crash.

Mount a linux iso via loopback on /mnt/test, rsync /mnt/test/ to another directory on xvda3, delete the files, and then repeat until the crash.

This is very similar to the following situation:

http://www.amailbox.org/mailarchive/linux-kernel/2010/9/1/4614107

Jeremy Fitzhardinge replied to that thread, indicating that his "xen: use percpu interrupts for IPIs and VIRQs" and "xen: handle events as edge-triggered" patches should fix the issue. These were introduced in 2.6.36-rc3, I believe, and the issue persists. Disabling irqbalance in dom0, as he suggested as a workaround, has no effect. I've also tried changing the scheduler and reducing the number of vcpus from 4 to 1; neither had any effect.
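For anyone trying to repeat this, the second recipe above (loop-mount an ISO, rsync it to xvda3, delete, repeat) could be scripted roughly as follows. This is only a sketch under stated assumptions, not the procedure actually used for this report: the function names, the `max_cycles` limit, and the use of `dmesg` to spot the warning are all made up for illustration. It needs root, and if the domU really locks up the script itself may hang before it can print anything.

```python
import re
import shutil
import subprocess

# Matches the hung-task warning quoted in the report, e.g.
# "[36841.420662] INFO: task rsyslogd:15014 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task warning in log_text."""
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(log_text)
    ]


def read_kernel_log():
    """Return the current kernel ring buffer as text (assumes dmesg is available)."""
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout


def copy_and_delete(src, scratch):
    """One cycle: rsync the loop-mounted tree into scratch, then delete the copy."""
    subprocess.run(["rsync", "-a", src.rstrip("/") + "/", scratch], check=True)
    shutil.rmtree(scratch)


def reproduce(iso, mountpoint, scratch, max_cycles=1000):
    """Repeat copy/delete cycles until a new hung-task warning shows up.

    Returns (cycle number, new warnings), or None if max_cycles passed cleanly.
    max_cycles is an arbitrary illustrative limit.
    """
    # Mount the ISO read-only through a loop device (requires root).
    subprocess.run(["mount", "-o", "loop,ro", iso, mountpoint], check=True)
    try:
        # Ignore warnings that were already in the log before the test started.
        baseline = len(find_hung_tasks(read_kernel_log()))
        for cycle in range(1, max_cycles + 1):
            copy_and_delete(mountpoint, scratch)
            hung = find_hung_tasks(read_kernel_log())
            if len(hung) > baseline:
                return cycle, hung[baseline:]
        return None
    finally:
        subprocess.run(["umount", mountpoint], check=False)
```

Calling `reproduce()` with the ISO path, `/mnt/test` as the mount point, and a scratch directory on xvda3 would run the loop. The baseline count is a simplification: it assumes older warnings have not been rotated out of the ring buffer during the run.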
Regards,

Nathan Gamber

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Thomas Halinka
2010-Nov-05 17:48 UTC
Re: [Xen-users] i/o scheduler deadlocks with loopback devices
Hi Nathan,

Which kernel is your dom0 running?

See you,

Thomas
Nathan Gamber
2010-Nov-05 17:56 UTC
Re: [Xen-users] i/o scheduler deadlocks with loopback devices
Thomas,

2.6.18-164.6.1.el5xen on CentOS 5.

Nathan