Graham, Simon
2007-Feb-14 15:08 UTC
RE: [Xen-devel] DomU crash during migration when suspending source domain
> In this particular case it is quite arguable that
> cache_remove_shared_cpu_map() should check cpuid4_info[i]!=NULL, just as
> done in cache_shared_cpu_map_setup(). I can make this fix in our tree but
> something similar ought to be submitted upstream too. I'm pretty certain
> that this will fix your crash.

Let me try that out here and get back to you -- I can submit a patch with this specific fix in if it solves the problem.

Since, as you say, this is just one aspect of dealing with hot plugging completely different processors, I somehow feel that a point fix like this wouldn't be accepted upstream and instead we'd need to think about a more complete solution (if, indeed, this is feasible).

> Upgrading upwards actually tends to be okay. I can't think of any practical
> examples of how that might fail. After all, worst case we can hide the extra
> features from the guest since we have some control over CPUID. *Downgrading*
> is the problem!

Understood... I can conceive of cases where this would not be true, but I agree that Intel/AMD usually do a good job of ensuring backward compatibility so we could hide the newer features until all systems have the newer processors in place and you reboot the domains.

Simon

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
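For readers following the thread, the point fix discussed above (skipping siblings whose cpuid4_info[i] is NULL) can be sketched in user-space C. This is only an illustrative model, not the kernel code: `leaf_info`, `struct cache_leaf`, `NR_CPUS`, `NUM_LEAVES` and `remove_shared_cpu_map()` are hypothetical stand-ins, and the real function's locking and cpumask handling are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS    4   /* hypothetical: small CPU count for illustration */
#define NUM_LEAVES 3   /* hypothetical: cache leaves per CPU */

/* Stand-in for the kernel's per-leaf cache description. */
struct cache_leaf {
    unsigned long shared_cpu_map;   /* bitmask of CPUs sharing this leaf */
};

/* Models cpuid4_info[]: one array of leaves per CPU, NULL if never set up. */
static struct cache_leaf *leaf_info[NR_CPUS];

/*
 * Remove 'cpu' from the shared map of every sibling of leaf 'index'.
 * Siblings whose per-CPU info was never allocated are skipped, which is
 * the NULL check proposed in the thread. Returns the number of maps updated.
 */
static int remove_shared_cpu_map(unsigned int cpu, unsigned int index)
{
    assert(cpu < NR_CPUS && index < NUM_LEAVES);

    if (leaf_info[cpu] == NULL)
        return 0;                   /* nothing was set up for this CPU */

    unsigned long map = leaf_info[cpu][index].shared_cpu_map;
    int updated = 0;

    for (unsigned int sib = 0; sib < NR_CPUS; sib++) {
        if (!(map & (1UL << sib)))
            continue;               /* not a sibling for this leaf */
        if (leaf_info[sib] == NULL)
            continue;               /* the guard: sibling info is missing */
        leaf_info[sib][index].shared_cpu_map &= ~(1UL << cpu);
        updated++;
    }
    return updated;
}
```

Without the second NULL test, the loop would dereference a missing sibling entry, which is the kind of crash the thread describes. The guard only avoids that one dereference; it does not address the wider hot-plug problem raised in the message.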
Keir Fraser
2007-Feb-14 15:43 UTC
Re: [Xen-devel] DomU crash during migration when suspending source domain
On 14/2/07 15:08, "Graham, Simon" <Simon.Graham@stratus.com> wrote:

> Let me try that out here and get back to you -- I can submit a patch
> with this specific fix in if it solves the problem.
>
> Since, as you say, this is just one aspect of dealing with hot plugging
> completely different processors, I somehow feel that a point fix like
> this wouldn't be accepted upstream and instead we'd need to think about
> a more complete solution (If, indeed, this is feasible).

Possibly true. In fact I think if you fix that function then you're going to die in kobject_unregister() instead. The loop in cache_remove_dev() is simply bogus in your case since num_cache_leaves cannot be trusted. A broader set of fixes might get accepted upstream because cache_add_dev() can fail for other reasons too (at least out-of-memory) and any such failure will cause cache_remove_dev() to barf. But it's not such a simple thing to fix and it does not solve the general problem for us.

 -- Keir
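One way to picture the "broader set of fixes" mentioned above is to have teardown rely on a per-CPU record of what was actually registered rather than on the global leaf count. The sketch below is a user-space model under that assumption, not a proposed kernel patch: `registered[]`, `register_leaf()`, `unregister_leaf()`, `fail_at` and the `*_sketch` functions are hypothetical names, and only `num_cache_leaves` is taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS    4   /* hypothetical: small CPU count for illustration */
#define MAX_LEAVES 4   /* hypothetical upper bound on leaves */

static unsigned int num_cache_leaves = 3;   /* global count, as in the thread */
static unsigned int registered[NR_CPUS];    /* leaves actually registered per CPU */
static unsigned int unregister_calls;       /* instrumentation for the example */

/* Hypothetical stand-in for object registration; fails at leaf 'fail_at'. */
static bool register_leaf(unsigned int cpu, unsigned int leaf, unsigned int fail_at)
{
    (void)cpu;
    return leaf != fail_at;
}

/* Hypothetical stand-in for kobject_unregister() on one leaf. */
static void unregister_leaf(unsigned int cpu, unsigned int leaf)
{
    (void)cpu;
    (void)leaf;
    unregister_calls++;
}

/* Tear down only what was registered, newest first; safe to call twice. */
static void cache_remove_dev_sketch(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    while (registered[cpu] > 0) {
        registered[cpu]--;
        unregister_leaf(cpu, registered[cpu]);
    }
}

/* Register each leaf, recording progress so a failure unwinds cleanly. */
static int cache_add_dev_sketch(unsigned int cpu, unsigned int fail_at)
{
    assert(cpu < NR_CPUS && num_cache_leaves <= MAX_LEAVES);
    for (unsigned int i = 0; i < num_cache_leaves; i++) {
        if (!register_leaf(cpu, i, fail_at)) {
            cache_remove_dev_sketch(cpu);   /* undo the leaves added so far */
            return -1;
        }
        registered[cpu] = i + 1;
    }
    return 0;
}
```

Because removal walks `registered[cpu]` instead of `num_cache_leaves`, a partial or failed add leaves nothing for a later remove to trip over. As the message notes, this kind of bookkeeping would not by itself solve the general problem of migrating between different processors.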