Alessandro Grassi
2011-Nov-14 10:48 UTC
[Xen-users] Random I/O deadlocks in multiple clouds
Greetings,

I have two clouds, one running XCP 0.5 and the other running XCP 1.0. For the past few weeks I have been having problems on both of them. At seemingly random moments, one or more of the domUs write this on the console:

INFO: task [random program] blocked for more than 120 seconds.

The log, meanwhile, fills up with stack traces: http://pastebin.com/ziyyWEXP

From then on, the VM becomes extremely laggy, if not completely unreachable.

The trace suggests it's an I/O problem, but the crashes don't seem to follow a pattern: they happen during high as well as low I/O traffic, high/low CPU load, high/low memory usage. The same thing happens on all dom0s of both my clouds.

The domUs are all running PVOPS-enabled kernels (2.6.32+), in a mix of vanilla+grsec, Debian stock and Debian backports (lenny/squeeze).

I'm keeping the dom0s under monitoring, but nothing specific seems to happen during the domU crashes: nothing in xe host-dmesg, nothing in the graphs.

At this point I'm quite lost and have no idea how to debug the issue further. Does anyone have any suggestions?

Thank you in advance.

Sincerely,
--
Alessandro
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Alessandro Grassi
2011-Nov-14 13:48 UTC
Re: [Xen-users] Random I/O deadlocks in multiple clouds
On Mon, 2011-11-14 at 11:48 +0100, Alessandro Grassi wrote:
> INFO: task [random program] blocked for more than 120 seconds.
[..]
> The trace suggests it's an I/O problem, but the crashes don't seem to
> follow a pattern: they happen during high as well as low I/O traffic,
> high/low cpu load, high/low memory usage.
[..]
> The domUs are all running PVOPS enabled kernels (2.6.32+), in a mix of
> vanilla+grsec, debian stock and debian backports (lenny/squeeze).

I forgot to add something that might be relevant, judging from what I'm reading in older discussions: all the machines I run use XFS.

Best regards,
Alessandro

--
Alessandro Grassi - System Manager at Devise.It S.r.l.
Tel: +39 0574870600 | Fax: +39 0574870601
On Mon, 2011-11-14 at 14:48 +0100, Alessandro Grassi wrote:
> On Mon, 2011-11-14 at 11:48 +0100, Alessandro Grassi wrote:
> > INFO: task [random program] blocked for more than 120 seconds.
> [..]
> > The trace suggests it's an I/O problem, but the crashes don't seem to
> > follow a pattern: they happen during high as well as low I/O traffic,
> > high/low cpu load, high/low memory usage.
> [..]
> > The domUs are all running PVOPS enabled kernels (2.6.32+), in a mix of
> > vanilla+grsec, debian stock and debian backports (lenny/squeeze).
>
> I forgot to add a thing that might be useful for what i'm reading on old
> discussions.
> All the machines i run use XFS.

To all those who might be curious about this issue: the file system has nothing to do with it. The problem seems to have vanished on virtual machines running kernel 2.6.39 or later.

Best regards,
Alessandro

--
Alessandro Grassi - System Manager at Devise.It S.r.l.
Tel: +39 0574870600 | Fax: +39 0574870601
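
For anyone who wants to check their own domUs against the two observations in this thread, here is a minimal Python sketch. It is only an illustration, not something that was run on the clouds discussed above. The function names find_hung_tasks and kernel_at_least are hypothetical, and the regular expression assumes the hung-task warning has the usual "task name:pid" form, which the console message quoted in the first mail elides.

```python
import re

# Assumed shape of the kernel's hung-task warning, for example:
#   INFO: task xfssyncd:1234 blocked for more than 120 seconds.
# The task name and pid in that example are made up for illustration.
HUNG_TASK_RE = re.compile(
    r"INFO: task (.+?):(\d+) blocked for more than (\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (task_name, pid, seconds) for each warning found."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


def kernel_at_least(release, minimum=(2, 6, 39)):
    """Return True if a release string such as '2.6.32-5-amd64' is >= minimum.

    Only the leading numeric part is compared; a missing third component
    (as in '3.2') is treated as 0.
    """
    numbers = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if numbers is None:
        raise ValueError("unrecognised kernel release: %r" % (release,))
    version = tuple(int(part or 0) for part in numbers.groups())
    return version >= minimum
```

Inside a domU, the running kernel could be checked with kernel_at_least(platform.release()) after importing the standard platform module, and find_hung_tasks could be fed the contents of the kernel log. The 2.6.39 threshold is just the version from which the problem stopped appearing here; it is not a confirmed fix point.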