Alex Bligh
2013-Feb-07 09:07 UTC
Question re live migrate on Xen 4.2 re different cpu capabilities
We've run into an issue with live migration on Xen 4.2. We've mainly tested HVM with the qemu-upstream device model and my live migrate patches applied, but it seems to apply to any HVM live migration.

What we see is that under certain circumstances a Linux guest booted on machine A will live-migrate to machine B and back again, but a Linux guest booted on machine B will not live-migrate to machine A: it hangs after the migration.

Having tracked this down across different pairs of machines A and B, the difference seems to be down to different CPU capabilities. In particular, if B has identical CPU flags to A except that B supports xsave, the above seems to happen reliably.

I presume the guest notes the presence of xsave at boot and uses it, and becomes unhappy when the recipient machine does not have it.

KVM with default settings does not suffer from this issue. I presume it presents a masked set of CPU capabilities by default. Is there any way to do similarly in Xen? Or is Xen live migrate effectively restricted to live migration between identical CPUs?

--
Alex Bligh
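To make the asymmetry between A and B concrete, here is a minimal Python sketch that compares the `flags` lines from two hosts' /proc/cpuinfo output and lists what a guest booted on one host could lose on the other. The function names and the shortened flag lists are hypothetical, and the flags visible in dom0 may already be filtered by Xen, so treat the result as a hint rather than proof.

```python
def parse_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def migration_risk(source_text, target_text):
    """Flags a guest may have seen on the source that the target lacks."""
    return sorted(parse_flags(source_text) - parse_flags(target_text))


# Hypothetical, heavily shortened flag lists for the two machines.
MACHINE_A = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 popcnt\n"
MACHINE_B = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 popcnt xsave\n"

print("B -> A:", migration_risk(MACHINE_B, MACHINE_A))
print("A -> B:", migration_risk(MACHINE_A, MACHINE_B))
```

An empty list in one direction and a non-empty list in the other matches the pattern reported: migration only fails towards the host with fewer features.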
Olaf Hering
2013-Feb-07 09:19 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
On Thu, Feb 07, Alex Bligh wrote:

> Is there any way to do similarly in Xen? Or is Xen live migrate effectively
> restricted to live migrate between identical CPUs?

It's mentioned somewhere in the docs that both CPUs must be sufficiently compatible. Use the cpuid= setting in the domU config file to enforce this.

Olaf
Ian Campbell
2013-Feb-07 09:47 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
On Thu, 2013-02-07 at 09:19 +0000, Olaf Hering wrote:

> On Thu, Feb 07, Alex Bligh wrote:
>
> > Is there any way to do similarly in Xen? Or is Xen live migrate effectively
> > restricted to live migrate between identical CPUs?
>
> It's mentioned somewhere in the docs that both CPUs must be sufficiently
> compatible. Use the cpuid= setting in the domU config file to enforce
> this.

You can also level down an entire host using the cpuid_mask_* command line options described in http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html

Ian.
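As a rough illustration of host-level levelling, the Python sketch below ANDs together the CPUID leaf 1 ECX/EDX feature words of every host in a pool and formats the result as hypervisor command line options. The option names `cpuid_mask_ecx` and `cpuid_mask_edx` are taken from my reading of the xen-command-line document linked above and should be checked there; the register values and host names are hypothetical, and whether masking works at all depends on the CPU supporting it.

```python
from functools import reduce

XSAVE_BIT = 26  # CPUID leaf 1, ECX bit 26


def level_masks(hosts):
    """AND the leaf-1 feature words of all hosts: only common bits survive."""
    ecx = reduce(lambda a, b: a & b, (h["ecx"] for h in hosts.values()), 0xFFFFFFFF)
    edx = reduce(lambda a, b: a & b, (h["edx"] for h in hosts.values()), 0xFFFFFFFF)
    return ecx, edx


def xen_cmdline(ecx, edx):
    """Format the masks as Xen boot options (option names are an assumption)."""
    return "cpuid_mask_ecx=0x%08x cpuid_mask_edx=0x%08x" % (ecx, edx)


# Hypothetical register values; machine-b differs only in the xsave bit.
POOL = {
    "machine-a": {"ecx": 0x0098E3FD, "edx": 0xBFEBFBFF},
    "machine-b": {"ecx": 0x0498E3FD, "edx": 0xBFEBFBFF},
}

ecx, edx = level_masks(POOL)
print(xen_cmdline(ecx, edx))
```

The same options would have to be set on every host in the pool, which is the cost of levelling at host level rather than per guest.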
Alex Bligh
2013-Feb-07 10:59 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
Ian, Olaf,

--On 7 February 2013 09:47:01 +0000 Ian Campbell <Ian.Campbell@citrix.com> wrote:

> On Thu, 2013-02-07 at 09:19 +0000, Olaf Hering wrote:
>> On Thu, Feb 07, Alex Bligh wrote:
>>
>> > Is there any way to do similarly in Xen? Or is Xen live migrate
>> > effectively restricted to live migrate between identical CPUs?
>>
>> It's mentioned somewhere in the docs that both CPUs must be sufficiently
>> compatible. Use the cpuid= setting in the domU config file to enforce
>> this.
>
> You can also level down an entire host using the cpuid_mask_* command
> line options described in
> http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html

Thanks - that's really helpful. Looking at the documentation for cpuid, I can't see an obvious way of saying 'mask everything except the least common denominator of features' using the libxl method without prior knowledge of what each version of Xen supports. I'm not even clear how to do this with the xend method (obviously I want to pass long bitstrings of zeros, but how many?). Am I missing something?

--
Alex Bligh
Olaf Hering
2013-Feb-07 15:11 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
On Thu, Feb 07, Alex Bligh wrote:

> >You can also level down an entire host using the cpuid_mask_* command
> >line options described in
> >http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html
>
> Thanks - that's really helpful. Looking at the documentation for cpuid
> I can't see an obvious way of saying 'mask everything except the
> least common denominator of features' using the libxl method, without
> having prior knowledge of what each version of xen supports. I'm not
> even clear how to do this on the xend method (obviously I want to
> pass long bitstrings of zeros, but how many?). Am I missing something?

cpuid= is what the guest sees, so it's not so much a feature of Xen itself as of what the host CPU provides. I'm not sure if Xen can emulate certain important bits. The Wikipedia CPUID entry has a list of what each bit means.

Olaf
Alex Bligh
2013-Feb-07 15:59 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
Olaf,

--On 7 February 2013 16:11:26 +0100 Olaf Hering <olaf@aepfle.de> wrote:

> cpuid= is what the guest sees, so it's not so much a feature of Xen
> itself as of what the host CPU provides. I'm not sure if Xen can emulate
> certain important bits. The Wikipedia CPUID entry has a list of what each
> bit means.

I know how to find out what the bits mean. But in some sense I don't care.

What I want to do is take a particular set of CPU features (which for the sake of argument I will obtain by booting a KVM domain), and say "mask off any additional cpuid flags beyond these", so that if a new feature gets introduced in the Xen codebase, it won't show in the guest. I think the only thing I can do at the moment is check the Xen code for every CPU feature Xen knows about, remove those I still want, and mask off the rest. What I actually want to do is to say "please don't expose any features other than these ones, and only expose these if the host supports them", so that if we change to a version of Xen that supports another feature, we won't need to poke around in every domain config.

--
Alex Bligh
Ian Campbell
2013-Feb-07 16:16 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
On Thu, 2013-02-07 at 15:59 +0000, Alex Bligh wrote:

> Olaf,
>
> --On 7 February 2013 16:11:26 +0100 Olaf Hering <olaf@aepfle.de> wrote:
>
> > cpuid= is what the guest sees, so it's not so much a feature of Xen
> > itself as of what the host CPU provides. I'm not sure if Xen can emulate
> > certain important bits. The Wikipedia CPUID entry has a list of what each
> > bit means.
>
> I know how to find out what the bits mean. But in some sense
> I don't care.
>
> What I want to do is take a particular set of cpu features (which for the
> sake of argument I will do by booting a kvm domain), and say "mask off
> any additional cpuid flags beyond these", so if a new feature gets
> introduced in the Xen codebase, it won't show in the guest. I think the
> only thing I can do at the moment is check the Xen code for every cpu
> feature Xen knows about, remove those I still want, and mask off the rest.
> What I actually want to do is to say "please don't expose any features
> other than these ones, and only expose these if the host supports them",
> so that if we change versions of Xen to one supporting another feature,
> we won't need to poke around in every domain config.

I think you can do this using what xl.cfg(5) describes as the "xend" syntax, by setting all the ones you aren't explicitly exposing to 0.

The "libxl syntax" is currently "=host,flag=value,..." which starts from the host and modifies the flags. I can't offhand think of a reason why we wouldn't also want to support a different keyword ("explicit"?) which means "starting from an empty slate add these". Patches accepted...

Ian.
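The "set everything you aren't explicitly exposing to 0" approach can be sketched in Python as a whitelist builder for one 32-bit register in the xend syntax. This assumes the bitstring is written most significant bit first and that 'x' leaves a bit to Xen's default policy, as I read xl.cfg(5). The helper names and the three chosen features are illustrative only; a real guest would need a far larger whitelist, and EDX would need its own entry.

```python
def whitelist_mask(allowed_bits, passthrough="x"):
    """Build a 32-character mask, bit 31 first.

    Allowed bits get the passthrough character; every other bit is
    forced to 0, including bits that future CPUs may define.
    """
    return "".join(
        passthrough if bit in allowed_bits else "0" for bit in range(31, -1, -1)
    )


def cpuid_entry(leaf, register, mask):
    """Format one xend-syntax cpuid entry, e.g. '1:ecx=<mask>'."""
    return "%s:%s=%s" % (leaf, register, mask)


# CPUID leaf 1 ECX bit numbers for a (far too small) illustrative whitelist.
SSE3, SSSE3, CX16 = 0, 9, 13

mask = whitelist_mask({SSE3, SSSE3, CX16})
print("cpuid = [ '%s' ]" % cpuid_entry("1", "ecx", mask))
```

Because unknown bits default to 0 here, a newly introduced feature flag stays hidden without touching the domain config, which is the property asked for earlier in the thread.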
Jan Beulich
2013-Feb-07 16:19 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
>>> On 07.02.13 at 16:59, Alex Bligh <alex@alex.org.uk> wrote:

> What I want to do is take a particular set of cpu features (which for the
> sake of argument I will do by booting a kvm domain), and say "mask off
> any additional cpuid flags beyond these", so if a new feature gets
> introduced in the Xen codebase, it won't show in the guest. I think the
> only thing I can do at the moment is check the Xen code for every cpu
> feature Xen knows about, remove those I still want, and mask off the rest.
> What I actually want to do is to say "please don't expose any features
> other than these ones, and only expose these if the host supports them",
> so that if we change versions of Xen to one supporting another feature,
> we won't need to poke around in every domain config.

You're heading in a slightly wrong direction here: you don't really care what features Xen knows about or supports. What you do care about is what features your DomU-s get to see. And that's where the masking comes into play (and why this can be done on a per-guest basis as well as at the host level).

Jan
Alex Bligh
2013-Feb-07 16:59 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
--On 7 February 2013 16:19:29 +0000 Jan Beulich <JBeulich@suse.com> wrote:

> You're heading in a slightly wrong direction here: you don't really
> care what features Xen knows about or supports. What you do care
> about is what features your DomU-s get to see. And that's where
> the masking comes into play (and why this can be done on a per-guest
> basis as well as at the host level).

I want my domUs to see no more than a fixed set of CPU flags (obviously if those CPU flags aren't present, I don't want to lie that they are). I have no visibility of what hardware my software may be installed on in the future. E.g. if Intel introduces a megawidget CPU flag and Xen adds support for it, I want to be guaranteed this is masked out, as it will break live migration between megawidget and non-megawidget machines.

--
Alex Bligh
Alex Bligh
2013-Feb-07 16:59 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
Ian,

--On 7 February 2013 16:16:55 +0000 Ian Campbell <Ian.Campbell@citrix.com> wrote:

> I think you can do this using what xl.cfg(5) describes as the "xend"
> syntax, by setting all the ones you aren't explicitly exposing to 0.

OK thanks.

> The "libxl syntax" is currently "=host,flag=value,..." which starts from
> the host and modifies the flags. I can't offhand think of a reason why
> we wouldn't also want to support a different keyword ("explicit"?) which
> means "starting from an empty slate add these". Patches accepted...

I may just take you up on that.

--
Alex Bligh
Olaf Hering
2013-Feb-08 13:36 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
On Thu, Feb 07, Alex Bligh wrote:

> --On 7 February 2013 16:19:29 +0000 Jan Beulich <JBeulich@suse.com> wrote:
>
> >You're heading in a slightly wrong direction here: you don't really
> >care what features Xen knows about or supports. What you do care
> >about is what features your DomU-s get to see. And that's where
> >the masking comes into play (and why this can be done on a per-guest
> >basis as well as at the host level).
>
> I want my domUs to see no more than a fixed set of CPU flags
> (obviously if those CPU flags aren't present, I don't want to
> lie that they are). I have no visibility of what hardware my
> software may be installed on in the future. EG if Intel introduces
> a megawidget CPU flag and Xen adds support for it, I want to be
> guaranteed this is masked out as it will break live migrate
> between megawidget and non-megawidget compatible machines.

I'm not sure what the question is. In a recent bug I had to disable popcnt because one of the hosts did not have it. So I came up with this config entry, which disables the popcnt and sse4* bits:

cpuid=[ '1:ecx=xxxxxxxx0xx00xxxxxxxxxxxxxxxxxxx' ]

I think you have to read through the CPUID entry, set a few "required" bits and a few "common across your hardware zoo" bits to 1, and set everything else to 0 to handle the upcoming megawidget bit:
http://en.wikipedia.org/wiki/CPUID

If you use xend, you may need this patch to handle cpuid properly:

Only add cpuid and cpuid_check to sexpr once

When converting a XendConfig object to sexpr, cpuid and cpuid_check were being emitted twice in the resulting sexpr. The first conversion writes incorrect sexpr, causing parsing of the sexpr to fail when xend is restarted and domain sexpr files in /var/lib/xend/domains/<dom-uuid> are read and parsed.

This patch skips the first conversion, and uses only the custom cpuid{_check} conversion methods called later. It is not pretty, but is the least invasive fix in this complex code.

Index: xen-4.2.0-testing/tools/python/xen/xend/XendConfig.py
===================================================================
--- xen-4.2.0-testing.orig/tools/python/xen/xend/XendConfig.py
+++ xen-4.2.0-testing/tools/python/xen/xend/XendConfig.py
@@ -1126,6 +1126,10 @@ class XendConfig(dict):
         else:
             for name, typ in XENAPI_CFG_TYPES.items():
                 if name in self and self[name] not in (None, []):
+                    # Skip cpuid and cpuid_check.  Custom conversion
+                    # methods for these are called below.
+                    if name in ("cpuid", "cpuid_check"):
+                        continue
                     if typ == dict:
                         s = self[name].items()
                     elif typ == list:

Olaf
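For readers puzzling over how the 32-character string in the config entry above maps to feature bits, here is a small Python sketch that encodes and decodes such a mask. It assumes the leftmost character is bit 31, which is consistent with popcnt, sse4.2 and sse4.1 being ECX bits 23, 20 and 19 of CPUID leaf 1. The function names are made up for illustration.

```python
def clear_bits(bits_to_clear):
    """Return a 32-character mask (bit 31 first) forcing the given bits to 0.

    All other bits are left as 'x', i.e. not overridden by this entry.
    """
    return "".join("0" if bit in bits_to_clear else "x" for bit in range(31, -1, -1))


def forced_zero(mask):
    """Decode a mask string back into the bit numbers it forces to 0."""
    return [31 - index for index, char in enumerate(mask) if char == "0"]


# CPUID leaf 1 ECX bit numbers.
SSE4_1, SSE4_2, POPCNT = 19, 20, 23

mask = clear_bits({SSE4_1, SSE4_2, POPCNT})
print("cpuid=[ '1:ecx=%s' ]" % mask)
print("bits forced to 0:", forced_zero(mask))
```

The generated string should match the mask in the message above: eight untouched bits, then bit 23 cleared, two untouched, bits 20 and 19 cleared, and nineteen untouched.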
Konrad Rzeszutek Wilk
2013-Feb-08 20:04 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
On Fri, Feb 8, 2013 at 8:36 AM, Olaf Hering <olaf@aepfle.de> wrote:

> On Thu, Feb 07, Alex Bligh wrote:
>
>> --On 7 February 2013 16:19:29 +0000 Jan Beulich <JBeulich@suse.com> wrote:
>>
>> >You're heading in a slightly wrong direction here: you don't really
>> >care what features Xen knows about or supports. What you do care
>> >about is what features your DomU-s get to see. And that's where
>> >the masking comes into play (and why this can be done on a per-guest
>> >basis as well as at the host level).
>>
>> I want my domUs to see no more than a fixed set of CPU flags
>> (obviously if those CPU flags aren't present, I don't want to
>> lie that they are). I have no visibility of what hardware my
>> software may be installed on in the future. EG if Intel introduces
>> a megawidget CPU flag and Xen adds support for it, I want to be
>> guaranteed this is masked out as it will break live migrate
>> between megawidget and non-megawidget compatible machines.
>
> I'm not sure what the question is. In a recent bug I had to disable
> popcnt because one of the hosts did not have it. So I came up with this
> config entry, which disables the popcnt and sse4* bits:
>
> cpuid=[ '1:ecx=xxxxxxxx0xx00xxxxxxxxxxxxxxxxxxx' ]
>
> I think you have to read through the CPUID entry, set a few "required" bits
> and a few "common across your hardware zoo" bits to 1, and set everything else
> to 0 to handle the upcoming megawidget bit:
> http://en.wikipedia.org/wiki/CPUID
>
> If you use xend, you may need this patch to handle cpuid properly:
>
> Only add cpuid and cpuid_check to sexpr once
>
> When converting a XendConfig object to sexpr, cpuid and cpuid_check
> were being emitted twice in the resulting sexpr. The first conversion
> writes incorrect sexpr, causing parsing of the sexpr to fail when xend
> is restarted and domain sexpr files in /var/lib/xend/domains/<dom-uuid>
> are read and parsed.
>
> This patch skips the first conversion, and uses only the custom
> cpuid{_check} conversion methods called later. It is not pretty, but
> is the least invasive fix in this complex code.

I recall seeing that libvirt had some of this figured out. It would know which CPUID flags each CPU family had, and you could actually set one ('I am a Westmere CPU'), or it would use the lowest common CPU family supported for all the guests.

Granted, that means you need to know _which_ of the machines has the lowest common CPU family first. Or you set the guest to say 'Core'.

Anyhow, perhaps looking at libvirt and implementing something similar in 'xl' would be beneficial for these issues?
Olaf Hering
2013-Feb-11 16:18 UTC
Re: Question re live migrate on Xen 4.2 re different cpu capabilities
On Fri, Feb 08, Konrad Rzeszutek Wilk wrote:

> I recall seeing that libvirt had some of this figured out. It would
> know which CPUID flags each CPU family had, and you could actually
> set one ('I am a Westmere CPU'), or it would use the lowest common CPU
> family supported for all the guests.
>
> Granted, that means you need to know _which_ of the machines has the
> lowest common CPU family first. Or you set the guest to say 'Core'.
>
> Anyhow, perhaps looking at libvirt and implementing something similar
> in 'xl' would be beneficial for these issues?

I haven't looked at the code, but I can imagine it does that via qemu. But it would be a nice thing to have proper cpuid= handling where libvirt can force a certain CPU type.

Olaf
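The "lowest common CPU family" idea from the last two messages can be sketched in Python as follows. The model table is deliberately simplified and is not libvirt's actual CPU map; the host names, flag spellings and function name are likewise hypothetical. The sketch intersects the flags of all hosts and picks the newest model whose required flags are all present.

```python
# Simplified, illustrative model table, oldest first (not libvirt's data).
CORE2 = {"sse3", "ssse3"}
NEHALEM = CORE2 | {"sse4_1", "sse4_2", "popcnt"}
WESTMERE = NEHALEM | {"aes"}
SANDYBRIDGE = WESTMERE | {"avx", "xsave"}

MODELS = [
    ("Core2", CORE2),
    ("Nehalem", NEHALEM),
    ("Westmere", WESTMERE),
    ("SandyBridge", SANDYBRIDGE),
]


def best_common_model(host_flags):
    """Pick the newest model whose flags every host in the pool provides."""
    common = set.intersection(*host_flags.values())
    best = None
    for name, required in MODELS:
        if required <= common:
            best = name
    return best


# Hypothetical pool: one Westmere-class and one Sandy Bridge-class host.
POOL = {"host-a": set(WESTMERE), "host-b": set(SANDYBRIDGE)}
print(best_common_model(POOL))

# Adding an older host lowers the common model for the whole pool.
POOL["host-c"] = set(NEHALEM)
print(best_common_model(POOL))
```

A guest pinned to the chosen model would never be shown xsave on the mixed pool above, which avoids the failure described at the start of the thread without enumerating individual bits by hand.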