thr3ads.net - Xen users - [Xen-users] i/o scheduler deadlocks with loopback devices [Nov 2010]

Nathan Gamber

2010-Nov-05 17:30 UTC

[Xen-users] i/o scheduler deadlocks with loopback devices

This is an email I sent to xen-devel a while ago without getting a
response. I'm reposting it here in case someone knows more.

Hello all,

I'm able to consistently reproduce lockups in my domU under heavy I/O,
with the following error:

[36841.420662] INFO: task rsyslogd:15014 blocked for more than 120 seconds.
[36841.420843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

The blocked task varies; it can be any task that happens to be active
(kjournald, loop0, etc.).
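
Since the blocked task changes from run to run, it can help to pull the task names out of the kernel log. This is a minimal sketch, assuming each report is on one line in the form "INFO: task NAME:PID blocked for more than N seconds"; the function name hung_tasks is made up for illustration.

```shell
# Hypothetical helper: read kernel log text on stdin and print each distinct
# task name that the hung-task detector reported as blocked.
hung_tasks() {
    # Capture everything between "task " and the trailing ":PID", so task
    # names that themselves contain a colon are still kept whole.
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than.*/\1/p' | sort -u
}
```

It would typically be fed from the kernel ring buffer, for example `dmesg | hung_tasks`.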

My setup is:
Xen dom0 version 3.4.2.
domU: Ubuntu 10.04, 2.6.36-rc6 based on Stefano Stabellini's
v2.6.36-rc6-urgent-fixes tree.
Paravirtual disks and network interfaces.
Root filesystem on /dev/xvda3, formatted ext3, mounted with default options.
Both dom0 and domU are using the CFQ I/O scheduler.

The domU's virtual block device is backed by LVM, on top of a local SATA RAID array.
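
To confirm which I/O scheduler a domain is actually using, the kernel's sysfs scheduler file can be read; it lists the available schedulers and puts the active one in brackets. This is a rough sketch: the device name xvda comes from the setup above, while the function name active_scheduler is only illustrative.

```shell
# Hypothetical helper: print the active I/O scheduler for a block device.
# $1 is the path to a sysfs scheduler file; it defaults to xvda, the disk
# named in the post. The file content looks like "noop deadline [cfq]".
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "${1:-/sys/block/xvda/queue/scheduler}"
}
```

Running `active_scheduler` in the domU (or with the matching device path in dom0) should print the bracketed entry. Writing a scheduler name to the same file as root, for example `echo deadline > /sys/block/xvda/queue/scheduler`, switches it at runtime.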


To reproduce this, I can do either of the following:

Set up the domU as a primary DRBD node, with my DRBD volume on top of a
local loopback device, then rsync many files to the volume, delete
them, and repeat until the crash.

Mount a Linux ISO via loopback on /mnt/test, rsync /mnt/test/ to
another directory on xvda3, delete the files, and repeat until the
crash.
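
The second recipe can be sketched as a small loop. This is only an approximation of the steps described, not the exact commands used: the function name churn, the scratch directory, and the round count are assumptions, and the destination is deleted on every round, so it must be a throwaway path.

```shell
# Hypothetical helper: copy a tree with rsync, delete the copy, and repeat.
#   $1 source directory (e.g. /mnt/test, the loop-mounted ISO)
#   $2 scratch destination on the filesystem under test (removed each round)
#   $3 number of copy/delete rounds to attempt
churn() {
    src=$1
    dst=$2
    rounds=$3
    # Refuse to run without a destination, since it is removed with rm -rf.
    [ -n "$dst" ] || return 1
    i=0
    while [ "$i" -lt "$rounds" ]; do
        mkdir -p "$dst" || return 1
        rsync -a "$src/" "$dst/" || return 1
        rm -rf "$dst"
        i=$((i + 1))
        echo "completed round $i"
    done
}
```

With an ISO mounted via something like `mount -o loop,ro image.iso /mnt/test` (the image name is a placeholder), a call such as `churn /mnt/test /root/churn-scratch 100` would run the cycle. If the domU locks up, the round counter simply stops advancing.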

This is very similar to the following situation:

http://www.amailbox.org/mailarchive/linux-kernel/2010/9/1/4614107

Jeremy Fitzhardinge replied to that thread, indicating that his "xen:
use percpu interrupts for IPIs and VIRQs" and "xen: handle events as
edge-triggered" patches should fix the issue. I believe these were merged
in 2.6.36-rc3, yet the issue persists. Disabling irqbalance in
dom0, which he suggested as a workaround, has no effect. I've also tried
changing the I/O scheduler and reducing the number of vcpus from 4 to 1;
neither had any effect.

Regards,

Nathan Gamber

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users

Thomas Halinka

2010-Nov-05 17:48 UTC


Re: [Xen-users] i/o scheduler deadlocks with loopback devices

Hi Nathan,

Which kernel is your dom0 running?

cu,

thomas



Nathan Gamber

2010-Nov-05 17:56 UTC


Re: [Xen-users] i/o scheduler deadlocks with loopback devices

Thomas,

2.6.18-164.6.1.el5xen on CentOS 5.


Nathan

