There are various differences between x86 CPU types that I believe would cause a guest to fail after being migrated. Are there checks in the migration code to prevent this from happening? Does it check for an "incompatible CPU" and fail early, leaving the guest running on the source host? If there is a check, what is its nature? (Exact match of CPU type/rev or something based on CPU features?)

Thanks,

John Byrne

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
John Byrne wrote:
> Are there checks in the migration code to prevent this from happening?
> Does it check for an "incompatible CPU" and fail early, leaving the
> guest running on the source host? If there is a check, what is its
> nature? (Exact match of CPU type/rev or something based on CPU features?)

No, there are no checks for CPU type in the migration code. And since the RAM image itself is being copied, this certainly creates problems.

But it's even more complicated than that: whether the migrated guest fails depends on the CURRENT RAM image, so a migration may succeed at one point and fail at another. It all depends on the current state.

Here is an interesting example I ran into:

I created my guest on my Athlon and tried to migrate it to my laptop running a Pentium M. The guest failed.

When I tried the opposite, creating it on my laptop and migrating it to my Athlon, it worked... Not only that, but now, after being forged in the flames of the Pentium M, I could migrate it BACK from the Athlon to the laptop... then to an Opteron... That was fun. It can actually function like a roaming guest :-)

One problem is that OSs usually gather information about the system they are running on, including all the features the CPU offers them.

What if one or more of those features is not supported on the target host? You think it will crash?

NOT necessarily. It won't crash if there's currently no running code on the guest that uses that feature.

I believe it will also be a problem if some code that USES such a feature is IN the guest RAM image but for some reason is not currently running, until a user does something... This could lead to seemingly unexplained crashes.

I am also interested in what happens to future compilations/executions on the migrated OS. They can be affected too.

NOW, having said that... completely preventing migration between CPUs that are not 100% compatible may not be a good idea. After all, you may KNOW that the current configuration will work on the other machine (or you may know the guest was migrated from there in the first place), and you may need to do a hardware upgrade with no downtime. There is no reason to prevent this.

I discovered that as long as you create the guest on the machine with the FEWEST features among the ones it may be migrated to (the one that has NO feature the other ones don't have), the migration seems to work fine in every direction.
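The "create the guest on the machine with the fewest features" rule above can be sketched as a set computation. This is only an illustration, not anything Xen does: it assumes each host's feature list has been collected from the `flags` line of a Linux-style `/proc/cpuinfo`, and the host names and flag sets in the usage note are made up.

```python
def parse_flags(cpuinfo_text):
    """Return the feature flags of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def common_features(hosts):
    """Features present on every host (hosts maps a name to a set of flags)."""
    feature_sets = list(hosts.values())
    return set.intersection(*feature_sets) if feature_sets else set()


def baseline_hosts(hosts):
    """Hosts that have NO feature the other hosts lack.

    A guest created on one of these hosts has only seen features that
    every other host in the pool also provides.
    """
    common = common_features(hosts)
    return sorted(name for name, feats in hosts.items() if feats == common)
```

For a hypothetical pool `{"a": {"fpu", "mmx"}, "b": {"fpu", "mmx", "sse"}}`, `baseline_hosts` returns `["a"]`. If the list comes back empty, no single host is a subset of all the others, and the rule of thumb above does not apply to that pool.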
In a production environment, no one will rely on migration if it can cause mysterious crashes, either of the guest or of applications in the guest, at some random point in the future. Xen needs to address this in some manner. It may need an override for someone who really wants to do the migration, but by default it should not allow movement between incompatible CPUs.

John

Noam Taich wrote:
> No. no checks for CPU type in migration code. [...]
That's indeed a problem. I experienced exactly the same issues when I tried the live-migration feature. It works as long as the destination CPU has more capabilities than the source CPU. In the other direction, the domU will most likely restart directly after the migration process is over (at least that is what I saw in all my tests).

Having a datacenter with only one type of server is not realistic. No one will buy a hundred servers at once just to have identical systems. In real life there will be some servers to start with, and when more are needed, new servers (probably not the same type) are bought.

The question is how to solve this. It would help, for example, if the user could configure which CPU capabilities (SSE, MMX, PAE and so on) are available to the domU kernel. Then the admin could look at which CPU capabilities are available on all of his systems (at least all systems that are candidates for live migration) and use only those capabilities, even if it costs some performance.

But I've got a question: what does Linux do? There is CPU hotplugging support in recent 2.6 kernels, right? What happens if a CPU with fewer features is plugged into such a system and the more powerful CPU gets removed? Will a normal Linux system reboot then?

--Ralph

P.S.: I also agree that Xen should at least warn if someone tries to migrate to a CPU with fewer capabilities, so that the admin can abort his plan before it's too late. For example, an option such as "--force" could be used if the admin wants to migrate even though Xen thinks that this is not a good idea.

On Thursday, 16 February 2006 at 17:56, John Byrne wrote:
> In a production environment, no one will rely on migration if it can
> cause mysterious crashes either of the guest or of applications in the
> guest at some random point in the future. [...]
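The two ideas in Ralph's message (a configurable set of CPU capabilities for the domU, and a warning with a "--force" style override) could look roughly like the sketch below. It is a hypothetical pre-migration check, not existing Xen or xm behaviour; the function names, the `force` parameter and the message strings are assumptions made for illustration.

```python
def guest_visible_features(host_features, allowed_features):
    """Features a guest would see if everything outside the allowed set were hidden."""
    return set(host_features) & set(allowed_features)


def check_migration(guest_features, target_features, force=False):
    """Decide whether to migrate; returns (proceed, message).

    guest_features: features the guest was given on its current host.
    target_features: features the destination CPU provides.
    force: hypothetical override, in the spirit of the proposed "--force".
    """
    missing = sorted(set(guest_features) - set(target_features))
    if not missing:
        return True, "target provides every feature the guest was given"
    warning = "target lacks: " + ", ".join(missing)
    if force:
        return True, "WARNING (forced): " + warning
    return False, "refusing to migrate, " + warning
```

With this shape, the default is to refuse and leave the guest running on the source host, while an admin who knows the configuration is safe can still go ahead. Restricting the guest to `guest_visible_features(host, allowed)` at creation time is what makes the later check pass on every host that provides the allowed set.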
John Byrne wrote:
> Are there checks in the migration code to prevent this from happening?
> Does it check for an "incompatible CPU" and fail early, leaving the
> guest running on the source host? If there is a check, what is its
> nature? (Exact match of CPU type/rev or something based on CPU features?)

This, unfortunately, is always going to be a heuristic. There's no way to guarantee CPU compatibility (except, of course, with VT/SVM).

The major problem is the CPUID instruction. If an application obtains CPUID info (which I think can even contain a serial number for the processor) and relies on it (even for something simple like CPU frequency), it will break when moved to another CPU that doesn't report exactly the same CPUID information.

This is a bit contrived, but it is important as a base case: it shows that there is no completely safe heuristic for migration.

There are, of course, going to be common cases of incompatibility, mostly related to things like the presence of SIMD instructions. One can argue that for a lot of server workloads this isn't going to be an issue.

Because of this, I don't think it's horrible that Xend doesn't do any checking right now. It certainly wouldn't be harmful to do some checking and issue a warning to the user before migration. Personally, I think it's far easier for high-level tools to take care of this, since they will have a better idea of the cluster's hardware layout and can make appropriate decisions based on risk compared to availability.

Regards,

Anthony Liguori
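To make the CPUID point concrete: feature flags are only part of what a guest can observe. The sketch below compares identity fields that Linux derives largely from CPUID and exposes in `/proc/cpuinfo` on x86 (`vendor_id`, `cpu family`, `model`, `stepping`, `cpu MHz`). It is an assumption-laden illustration of what a higher-level tool might report, not a complete compatibility test, and the sample values in the usage note are invented.

```python
IDENTITY_KEYS = ("vendor_id", "cpu family", "model", "stepping", "cpu MHz")


def parse_first_cpu(cpuinfo_text):
    """Parse the first processor block of /proc/cpuinfo-style text into a dict."""
    info = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def identity_differences(source, target, keys=IDENTITY_KEYS):
    """Return {key: (source_value, target_value)} for every field that differs."""
    return {
        key: (source.get(key), target.get(key))
        for key in keys
        if source.get(key) != target.get(key)
    }
```

Two hosts with identical feature flags can still differ here, for example `{"cpu MHz": ("1800.000", "2000.000")}`, which is exactly the kind of difference that breaks software relying on the reported values. A cluster-level tool could show these differences alongside any missing features and let the operator weigh risk against availability.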
Noam Taich wrote:
> I created my guest on my Athlon and tried to migrate it to my laptop
> running a Pentium M. The guest failed.

As a general principle, migrating between AMD and Pentium chipsets is a bad idea. If you google a bit, you will find that VMware has VMotion compatibility tables listing which processor families they consider "safe" to migrate to and from.

Regards,

Anthony Liguori