Joshua West
2010-Dec-15 23:03 UTC
[Xen-users] RHEL6 domU migrate issues w/ higher to lower frequency CPU's
Hey folks,

I've encountered a rather interesting (and frustrating) issue with RHEL6 domUs and live migration.

I have no problems booting a RHEL6 domU using its stock/native kernel on Xen 3.4.1 or Xen 3.4.3. With live migration, however, there seems to be a problem when moving from a system with a higher CPU clock speed (MHz) to one with a lower clock speed, even if the higher-clocked system has a much older CPU model.

For example, I can reproduce the bug under Xen 3.4.3 with the following:

* Migrating from X5450 @ 3.00GHz to X5355 @ 2.66GHz fails, but the opposite (increasing in CPU frequency) succeeds.
* Migrating from Xeon(TM) CPU 2.80GHz to E5310 @ 1.60GHz fails, but the opposite (increasing in CPU frequency) succeeds.

BTW, when I say "fails", what I really mean is that the migration succeeds but the domU is no longer responsive. I can attach to the console via 'xm console' but nothing is displayed, although occasionally a new line is printed as I bang my hands on the Enter key. Occasionally ping works, and occasionally I can establish a connection to the domU's port 22 and see the OpenSSH banner, but that's as far as I get. It's not as if the domU is running away with 100% CPU; it sits in state "-b----" (xm list).

I have tested cpuid masking, but this doesn't help. The issue appears to be specific to going from a higher CPU frequency system to a lower CPU frequency system.

This is using the stock RHEL6 kernel 'vmlinuz-2.6.32-71.7.1.el6.x86_64'.

Does anybody have suggestions on the cause or a workaround? Has anyone else experienced this issue?

I've heard through the grapevine that this bug is also confirmed with RHEL6 domUs on XCP 1.0.

Thanks for any help you can provide!

--
Joshua West
Senior Systems Engineer
Brandeis University
http://www.brandeis.edu

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
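For anyone trying to reproduce this, the "higher to lower MHz" condition above can be checked before migrating. The following is only a minimal sketch: it assumes you have copied the contents of /proc/cpuinfo from the source and destination hosts into strings, and the function names (max_cpu_mhz, is_downclock_migration) are hypothetical helpers, not part of any Xen tool. Note that the "cpu MHz" field can vary with frequency scaling, so treat the result as a hint rather than a definitive answer.

```python
import re


def max_cpu_mhz(cpuinfo_text):
    """Return the highest 'cpu MHz' value in /proc/cpuinfo-style text."""
    values = [
        float(match)
        for match in re.findall(
            r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE
        )
    ]
    if not values:
        raise ValueError("no 'cpu MHz' lines found")
    return max(values)


def is_downclock_migration(src_cpuinfo, dst_cpuinfo):
    """True if the source host reports a higher clock than the destination."""
    return max_cpu_mhz(src_cpuinfo) > max_cpu_mhz(dst_cpuinfo)


if __name__ == "__main__":
    # Hypothetical sample inputs; real text would come from each dom0.
    source = "processor\t: 0\ncpu MHz\t\t: 2992.498\n"
    destination = "processor\t: 0\ncpu MHz\t\t: 2660.000\n"
    print(is_downclock_migration(source, destination))
```

A check like this only tells you which direction of migration you are about to test; it says nothing about the underlying cause.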
Paras pradhan
2010-Dec-16 15:40 UTC
Re: [Xen-users] RHEL6 domU migrate issues w/ higher to lower frequency CPU's
I am having exactly the same problem. In addition, this bug also appears when both nodes have the same CPU. I have tested with a Quad-Core AMD Opteron(tm) Processor 8374 HE in both nodes and see the same problem.

Symptoms:

* Start the domU on node1. No problem.
* Live migrate to node2 (same CPU, 100% same hardware). The domU is unresponsive: no output in xm console; ping works and the SSH banner appears, but everything is dead slow.
* Migrate back to node1. No problem.

I could reproduce this problem on another set of systems too.

I have had a ticket open with Red Hat for around 2 weeks, with no help so far.

I have also seen this bug filed with Red Hat, but I don't know whether they are working on it:
https://bugzilla.redhat.com/show_bug.cgi?id=613513

This is really frustrating.

Paras.
Joshua West
2010-Dec-16 18:58 UTC
Re: [Xen-users] RHEL6 domU migrate issues w/ higher to lower frequency CPU's
Hi Paras,

In what way does your migration fail? Does it succeed, but the virtual machine is then completely unresponsive, including the console? Or does the migration just error out and exit?

I've heard reports of problems like this with the stock RHEL 5.x Xen software, regardless of CPU clock speed. In my case, it only seems to be a problem when moving from servers with higher CPU clock speeds to those with lower ones. Also, it now looks like the console finally becomes responsive after about 5-10 minutes of being hung.

I've opened a case with Red Hat as well:
https://bugzilla.redhat.com/show_bug.cgi?id=663755

I'm interested to see whether you and I are experiencing the same bug or two different issues, as I'm running a later version of Xen.

Thanks for the input!
Paras pradhan
2010-Dec-16 21:19 UTC
Re: [Xen-users] RHEL6 domU migrate issues w/ higher to lower frequency CPU's
On Thu, Dec 16, 2010 at 12:58 PM, Joshua West <jwest@brandeis.edu> wrote:
> Hi Paras,
>
> In what way does your migration fail? Does it succeed but then the virtual
> machine is then completely unresponsive, including the console? Or does the
> migration just error/exit out?

Yes. The migration is successful each time I migrate, with no error. The Xen dom0 logs look normal too.

> I've heard reports of problems like this with the stock RHEL 5.x Xen
> software, regardless of CPU MHz clock speed, having issues. In my case, it
> only seems to be a problem when moving from servers with higher/faster CPU
> MHz clock speeds to those with lower. Also, it now looks like the console
> finally becomes responsive after about 5-10 minutes of being hung up.

Well, I have similar CPUs on all the nodes.

> I've opened a case with Red Hat as well:
> https://bugzilla.redhat.com/show_bug.cgi?id=663755

Great ...

> I'm interested to see if this is the same bug you and I are experiencing or
> if its two different issues, as I'm running a later version of Xen.

When I created a case with Red Hat, they told me they were not able to reproduce the issue. I have provided all my setup config files to them and have now been waiting for a week. But I am sure this has to do with the Red Hat 6 domU kernel; there is no problem with Red Hat 5 domUs.

I hope we will find the solution soon.

Thanks!
Paras.
Philipp Hahn
2011-Jan-19 12:59 UTC
Re: [Xen-users] RHEL6 domU migrate issues w/ higher to lower frequency CPU's
Hello,

On Thursday, 16 December 2010 00:03:07, Joshua West wrote:
> I've encountered a rather interesting/frustrating issue with RHEL6
> domU's and live migration.
>
> I have no problems booting a RHEL6 domU using its stock/native kernel on
> Xen 3.4.1 or Xen 3.4.3. But in terms of live migration, there seems to
> be a problem when moving from a higher (in terms of CPU MHz) to lower
> (MHz) system -- even if the higher of the two is a much older CPU model.
...
> BTW, when I say "fails", what I really mean is the migration succeeds
> but the domU is no longer responsive. I can attach to the console via
> 'xm console' but nothing is displayed, although occasionally a new line
> is printed as i bang my hands on the Enter key. Occasionally ping works
> and occasionally I can establish a connection to the domU's port 22 and
> see the OpenSSH banner, but thats as far as I get. Its not like the
> domU is runaway with 100% cpu. It sits with state "-b----" (xm list).

I encountered a very similar problem with our Debian-based distribution: ping often works, SSH sessions break after login, and xm console doesn't accept input but still prints kernel messages.

I think it relates to a bug in the pvclock driver in the domU kernel (my kernel is ~2.6.32-24, and I have not seen it fixed in 2.6.32-28 either), which also occurs when migrating between two hosts with different uptimes. You might want to check whether your domU kernel contains the fix from
<http://lists.xensource.com/archives/html/xen-devel/2010-10/msg01261.html>.
At least that patch seems to have solved my problem.

I found this very detailed bug report, which explains the problem:
<http://www.linux-archive.org/debian-kernel/447443-bug-602273-linux-image-2-6-32-5-686-bigmem-domu-hangs-during-dom0-reboot-recovers-when-dom0-uptime-caught-up.html>

Sincerely
Philipp
--
Philipp Hahn           Open Source Software Engineer      hahn@univention.de
Univention GmbH        Linux for Your Business        fon: +49 421 22 232- 0
Mary-Somerville-Str.1  28359 Bremen                   fax: +49 421 22 232-99
                                                    http://www.univention.de/
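To illustrate why differing host uptimes could produce a temporary hang, here is a toy model in Python. It is not the kernel's pvclock code, and it rests on two assumptions drawn only from the description above and the title of the linked Debian report (the domU "recovers when dom0 uptime caught up"): that the guest clock is derived from a host time base that differs between hosts, and that the guest refuses to report a time earlier than its previous reading. The class ToyGuestClock and the reset_last_value flag are hypothetical names; the flag models one plausible style of fix (forgetting the remembered value after migration), and whether the linked patch works this way is not confirmed here.

```python
class ToyGuestClock:
    """Toy model of a guest clock that never lets time go backwards."""

    def __init__(self, host_seconds):
        # Raw time supplied by the current host (e.g. seconds since host boot).
        self.host_seconds = host_seconds
        # Highest value ever returned to the guest.
        self.last_value = 0

    def read(self):
        # Monotonicity guard: never return less than the previous reading.
        self.last_value = max(self.last_value, self.host_seconds)
        return self.last_value

    def tick(self, seconds):
        # Real time passes on whichever host the guest is running on.
        self.host_seconds += seconds

    def migrate(self, new_host_seconds, reset_last_value=False):
        # Moving to another host swaps the raw time base underneath the guest.
        self.host_seconds = new_host_seconds
        if reset_last_value:
            self.last_value = 0


if __name__ == "__main__":
    clock = ToyGuestClock(host_seconds=1000)
    print("before migration:", clock.read())

    # Destination host has been up for less time than the source host.
    clock.migrate(new_host_seconds=400)
    for _ in range(3):
        clock.tick(250)
        print("after migration:", clock.read())
```

In this model the guest's clock appears frozen after the move until the destination's raw time passes the last value seen on the source, which would make timers and timeouts in the guest stall for roughly the difference between the two time bases. Calling migrate(..., reset_last_value=True) instead lets the clock advance immediately.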