Hi everybody,

I have set up Xen 4.1.2 on openSUSE 12.1. It is running fine so far, but live migration does not work. While the source host works on the "xm migrate -l ..." command, I can see the VM on the target host in paused state in "xm list", but after a while it vanishes, while the xm command on the source machine returns without any errors. (One machine has a Core i7 processor while the other is a Core 2 Quad system.)

I have searched the net and this list and found a few posts mentioning this error, but no solution, not even a hint at the source of the problem. Any help would be really appreciated.

Thanks, best wishes

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
>>> On 06.06.12 at 09:00, elahe shekuhi <e.shekuhi@gmail.com> wrote:
> I have set up Xen 4.1.2 on openSUSE 12.1. It is running fine so far, but live
> migration does not work. While the source host works on the "xm migrate -l
> ..." command, I can see the VM on the target host in paused state in "xm
> list", but after a while it vanishes while the xm command on the source
> machine returns without any errors.

Does the guest crash perhaps? In which case a kernel log from it might turn out pretty useful. As would technical data (rather than a simple "does not work") in the first place (hypervisor, xend, and Dom0 kernel logs are all possible candidates for holding relevant information).

> (One machine has core i7 processor while another is core 2
> quad system).

Does migration fail in both directions? Or perhaps just from the newer to the older system (in which case I would guess you're not masking features properly on the newer one)?

Jan
On Wed, 2012-06-06 at 08:41 +0100, Jan Beulich wrote:
> >>> On 06.06.12 at 09:00, elahe shekuhi <e.shekuhi@gmail.com> wrote:
> > I have set up Xen 4.1.2 on openSUSE 12.1. It is running fine so far, but live
> > migration does not work. While the source host works on the "xm migrate -l
> > ..." command, I can see the VM on the target host in paused state in "xm
> > list", but after a while it vanishes while the xm command on the source
> > machine returns without any errors.
>
> Does the guest crash perhaps? In which case a kernel log from it
> might turn out pretty useful. As would technical data (rather than
> a simple "does not work") in the first place (hypervisor, xend, and
> Dom0 kernel logs are all possible candidates for holding relevant
> information).

Elahe, please see http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for advice about the kind of information which you can usefully include in a bug report.

Thanks, Ian.

> > (One machine has core i7 processor while another is core 2
> > quad system).
>
> Does migration fail in both directions? Or perhaps just from the
> newer to the older system (in which case I would guess you're
> not masking features properly on the newer one)?
>
> Jan
Hi,

Thanks for your attention.

On Wed, Jun 6, 2012 at 8:41 AM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 06.06.12 at 09:00, elahe shekuhi <e.shekuhi@gmail.com> wrote:
> > I have set up Xen 4.1.2 on openSUSE 12.1. It is running fine so far, but live
> > migration does not work. While the source host works on the "xm migrate -l
> > ..." command, I can see the VM on the target host in paused state in "xm
> > list", but after a while it vanishes while the xm command on the source
> > machine returns without any errors.
>
> Does the guest crash perhaps? In which case a kernel log from it
> might turn out pretty useful. As would technical data (rather than
> a simple "does not work") in the first place (hypervisor, xend, and
> Dom0 kernel logs are all possible candidates for holding relevant
> information).
>
> > (One machine has core i7 processor while another is core 2
> > quad system).
>
> Does migration fail in both directions? Or perhaps just from the
> newer to the older system (in which case I would guess you're
> not masking features properly on the newer one)?

Yes, migration fails in both directions, but the error in xend.log is different.
When I migrate the VM from the Core 2 Quad to the Core i7 machine, the error in xend.log is:

[2012-04-23 17:25:55 1405] DEBUG (XendDomainInfo:237) XendDomainInfo.restore(['domain', ['domid', '5'], ['cpu_weight', '256'], ['cpu_cap', '0'], ['pool_name', 'Pool-0'], ['bootloader', '/usr/bin/pygrub'], ['vcpus', '2'], ['cpus', [[], []]], ['on_poweroff', 'destroy'], ['description', 'None'], ['on_crash', 'destroy'], ['uuid', '9d19e419-83ca-e877-52e4-0f93f634354a'], ['bootloader_args', '-q'], ['name', 'sles11-1'], ['on_reboot', 'restart'], ['maxmem', '1024'], ['memory', '512'], ['shadow_memory', '0'], ['vcpu_avail', '3'], ['features', ''], ['on_xend_start', 'ignore'], ['on_xend_stop', 'ignore'], ['start_time', '1335185584.3'], ['cpu_time', '0.353877786'], ['online_vcpus', '2'], ['image', ['linux', ['kernel', ''], ['args', ' '], ['superpages', '0'], ['videoram', '4'], ['pci', []], ['nomigrate', '0'], ['tsc_mode', '0'], ['device_model', '/usr/lib/xen/bin/qemu-dm'], ['notes', ['FEATURES', 'writable_page_tables|writable_descriptor_tables|auto_translated_physmap|pae_pgdir_above_4gb|supervisor_mode_kernel'], ['VIRT_BASE', '18446744071562067968'], ['GUEST_VERSION', '2.6'], ['PADDR_OFFSET', '0'], ['GUEST_OS', 'linux'], ['HYPERCALL_PAGE', '18446744071564193792'], ['LOADER', 'generic'], ['SUSPEND_CANCEL', '1'], ['ENTRY', '18446744071564165120'], ['XEN_VERSION', 'xen-3.0']]]], ['status', '2'], ['state', '-b----'], ['store_mfn', '470107'], ['console_mfn', '470106'], ['device', ['vif', ['bridge', 'br0'], ['mac', '00:16:3e:44:ac:23'], ['script', '/etc/xen/scripts/vif-bridge'], ['uuid', 'f3a500be-df8d-a7fe-881e-b8d8347dce74'], ['backend', '0']]], ['device', ['vkbd', ['backend', '0']]], ['device', ['console', ['protocol', 'vt100'], ['location', '2'], ['uuid', '0debd581-e78c-c87c-8012-330fa7d0eafb']]], ['device', ['vbd', ['protocol', 'x86_64-abi'], ['uuid', '8e7150c8-4af1-88c0-e71b-16049663cebc'], ['bootable', '1'], ['dev', 'xvda:disk'], ['uname', 'file:/home/elahe/xen/images/sles11-1/disk0.raw'], ['mode', 'w'], ['backend', '0'], ['VDI', '']]], ['device', ['vbd', ['protocol', 'x86_64-abi'], ['uuid', '55ff031d-7810-1b41-ed7b-f7fce59c6dcd'], ['bootable', '0'], ['dev', 'xvdb:cdrom'], ['uname', 'phy:/dev/sr0'], ['mode', 'r'], ['backend', '0'], ['VDI', '']]], ['device', ['vfb', ['vncunused', '1'], ['vnc', '1'], ['xauthority', '/root/.Xauthority'], ['keymap', 'en-us'], ['location', '127.0.0.1:5900'], ['uuid', '3d8f53d2-97b8-770b-59c7-69ac1cd302fa']]], ['change_home_server', 'False']])
[2012-04-23 17:25:55 1405] DEBUG (XendDomainInfo:2562) XendDomainInfo.constructDomain
[2012-04-23 17:25:55 1405] DEBUG (balloon:206) Balloon: 538624 KiB free; need 16384; done.
[2012-04-23 17:25:55 1405] DEBUG (XendDomain:482) Adding Domain: 4
[2012-04-23 17:25:55 1405] DEBUG (XendDomainInfo:3514) Storing VM details: {'on_xend_stop': 'ignore', 'pool_name': 'Pool-0', 'shadow_memory': '0', 'uuid': '9d19e419-83ca-e877-52e4-0f93f634354a', 'on_reboot': 'restart', 'start_time': '1335185584.3', 'on_poweroff': 'destroy', 'bootloader_args': '-q', 'on_xend_start': 'ignore', 'on_crash': 'destroy', 'xend/restart_count': '0', 'vcpus': '2', 'vcpu_avail': '3', 'bootloader': '/usr/bin/pygrub', 'image': "(linux (kernel '') (args ' ') (superpages 0) (videoram 4) (pci ()) (nomigrate 0) (tsc_mode 0) (device_model /usr/lib/xen/bin/qemu-dm) (notes (FEATURES 'writable_page_tables|writable_descriptor_tables|auto_translated_physmap|pae_pgdir_above_4gb|supervisor_mode_kernel') (VIRT_BASE 18446744071562067968) (GUEST_VERSION 2.6) (PADDR_OFFSET 0) (GUEST_OS linux) (HYPERCALL_PAGE 18446744071564193792) (LOADER generic) (SUSPEND_CANCEL 1) (ENTRY 18446744071564165120) (XEN_VERSION xen-3.0)))", 'name': 'sles11-1'}
[2012-04-23 17:25:55 1405] DEBUG (image:343) No VNC passwd configured for vfb access
[2012-04-23 17:25:55 1405] DEBUG (XendCheckpoint:359) restore:shadow=0x0, _static_max=0x40000000, _static_min=0x0,
[2012-04-23 17:25:55 1405] DEBUG (XendCheckpoint:386) [xc_restore]: /usr/lib64/xen/bin/xc_restore 4 4 1 2 0 0 0 0
[2012-04-23 17:26:41 1405] INFO (XendCheckpoint:487) xc: error: Couldn't set eXtended States for vcpu0 (22 = Invalid argument): Internal error
[2012-04-23 17:26:41 1405] DEBUG (XendDomainInfo:3150) XendDomainInfo.destroy: domid=4
[2012-04-23 17:26:41 1405] ERROR (XendDomainInfo:3164) XendDomainInfo.destroy: domain destruction failed.
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendDomainInfo.py", line 3157, in destroy
    xc.domain_pause(self.domid)
Error: (3, 'No such process')
[2012-04-23 17:26:41 1405] DEBUG (XendDomainInfo:2470) No device model
[2012-04-23 17:26:41 1405] DEBUG (XendDomainInfo:2472) Releasing devices
[2012-04-23 17:26:41 1405] ERROR (XendCheckpoint:421) /usr/lib64/xen/bin/xc_restore 4 4 1 2 0 0 0 0 failed
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendCheckpoint.py", line 390, in restore
    forkHelper(cmd, fd, handler.handler, True)
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendCheckpoint.py", line 475, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_restore 4 4 1 2 0 0 0 0 failed
[2012-04-23 17:26:41 1405] ERROR (XendDomain:1200) Restore failed
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendDomain.py", line 1184, in domain_restore_fd
    dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendCheckpoint.py", line 422, in restore
    raise exn
XendError: /usr/lib64/xen/bin/xc_restore 4 4 1 2 0 0 0 0 failed

I would really appreciate any help.

Best,
elahe
On 06/06/2012 08:15 PM, elahe shekuhi wrote:
> Hi,

Hi!

Maybe you could try using xl instead of xm. They behave differently concerning sources of errors imho.

hth, Mark
Hi,

How do I use "xl"? Should xend be disabled? How do I configure the source and destination machines for live migration? Is it the same as for the xm utility?

On Thu, Jun 7, 2012 at 7:42 AM, Mark Dokter <dokter@icg.tugraz.at> wrote:
> On 06/06/2012 08:15 PM, elahe shekuhi wrote:
> > Hi,
>
> Hi!
>
> Maybe you could try using xl instead of xm. They behave differently
> concerning sources of errors imho.
>
> hth, Mark
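[Editor's note: a rough sketch of the basic xl workflow, since the question keeps coming up. The service name and paths below are assumptions typical for openSUSE 12.1, not taken from this thread; the guest name and target host are placeholders. On Xen 4.1 the xl and xend toolstacks should not manage domains at the same time, so xend is stopped first.]

```
# On both hosts: stop xend so that xl is the only active toolstack.
rcxend stop

# On the source host: start the guest with xl, then migrate it live.
# The config path and "target-host" are hypothetical; xl migrate runs
# over ssh, so root ssh access to the target is needed.
xl create /etc/xen/vm/sles11-1
xl migrate sles11-1 target-host
```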
Hi,

I tried with the xl tool, but the error is the same.

On Thu, Jun 7, 2012 at 8:57 AM, elahe shekuhi <e.shekuhi@gmail.com> wrote:
> Hi,
>
> How do I use "xl"? Should xend be disabled? How do I configure the source
> and destination machines for live migration? Is it the same as for the xm
> utility?
>
> On Thu, Jun 7, 2012 at 7:42 AM, Mark Dokter <dokter@icg.tugraz.at> wrote:
>> Maybe you could try using xl instead of xm. They behave differently
>> concerning sources of errors imho.
>>
>> hth, Mark
>>> elahe shekuhi <e.shekuhi@gmail.com> 06/06/12 8:15 PM >>>
> On Wed, Jun 6, 2012 at 8:41 AM, Jan Beulich <JBeulich@suse.com> wrote:
>> Does migration fail in both directions? Or perhaps just from the
>> newer to the older system (in which case I would guess you're
>> not masking features properly on the newer one)?
>
> Yes, migration failed in both directions, but the error in Xend.log is
> different. When I do migrate VM from core 2 quad to core i7 machine the
> error in Xend.log is

Are you certain (i.e. is this really with the VM freshly started on the Core2 system)? I ask because this ...

> [2012-04-23 17:26:41 1405] INFO (XendCheckpoint:487) xc: error: Couldn't
> set eXtended States for vcpu0 (22 = Invalid argument): Internal error

... indicates that XSAVE state was available for the guest, but couldn't be restored, yet iirc Core2 doesn't know XSAVE yet.

Irrespective of that you could try booting the hypervisors on both sides with "no-xsave" and see whether that makes things any better (i.e. work at least in one direction). But generally, if you opened an entry in our bugzilla for this, we'd tell you that migration is only supported between feature-identical machines, or at best with features masked to the common subset on both systems.

Jan
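[Editor's note: the "no-xsave" option goes on the hypervisor's own command line, not the Dom0 kernel's. A sketch of what that might look like in a GRUB legacy menu.lst entry; the title, device, kernel versions and paths are placeholders, not taken from this thread.]

```
title Xen 4.1.2 (no-xsave)
    root (hd0,0)
    kernel /boot/xen.gz no-xsave
    module /boot/vmlinuz-xen root=/dev/sda2
    module /boot/initrd-xen
```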
On Wed, Jun 06, elahe shekuhi wrote:
> (One machine has core i7 processor while another is core 2 quad system).
>
> I have searched the net and this list and found a few posts mentioning this
> error, but no solution, not even a hint on the source of the problem.

As mentioned by others, for migration (or save/restore) either both systems have to have identical cpus, or the guest's view of the cpu has to be adjusted with the cpuid= option in the vm config file. In other words, the "least common denominator" of cpu features has to be configured.

See my recent post which tries to explain the cpuid= config option:
http://lists.xen.org/archives/html/xen-devel/2012-06/msg00303.html
There is an example in there for xend, and also a link to wikipedia which has a good list of cpu features.

I suggest to boot both systems with a native kernel and compare the cpu features, perhaps with something like this:

grep -wm1 ^flags /proc/cpuinfo | xargs -n1 | sort > /share_dir/core2.txt
grep -wm1 ^flags /proc/cpuinfo | xargs -n1 | sort > /share_dir/i7.txt
diff -u /share_dir/core2.txt /share_dir/i7.txt

I think most of the lines starting with '+' are features which exist only in the i7, and these can be hidden with the cpuid= option.

Make sure to install the xen packages from the 12.1 update repository since they contain two fixes for your environment: an xsave related bug was fixed, and guests with a cpuid= option were previously not restored at all during system startup.

Olaf
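[Editor's note: the comparison above can also be expressed with comm, which directly yields the two lists of interest: the common subset and the features to mask. The flag lists below are made-up samples for illustration; the real input would come from /proc/cpuinfo on each host, exactly as in Olaf's grep commands.]

```shell
# Build sorted feature lists for two hypothetical hosts.
# In practice: grep -wm1 ^flags /proc/cpuinfo | xargs -n1 | sort > host.txt
printf '%s\n' fpu sse sse2 ssse3 sse4_1 | sort > core2.txt
printf '%s\n' fpu sse sse2 ssse3 sse4_1 sse4_2 xsave avx | sort > i7.txt

# Features present on both hosts -- the "least common denominator":
comm -12 core2.txt i7.txt

# Features only the i7 has -- candidates to hide via cpuid= in the guest config:
comm -13 core2.txt i7.txt
```

With these sample lists, the second comm call prints avx, sse4_2 and xsave, each of which would need to be masked off on the i7 side.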
Hi,

On Thu, Jun 7, 2012 at 7:59 PM, Jan Beulich <jbeulich@suse.com> wrote:
> >>> elahe shekuhi <e.shekuhi@gmail.com> 06/06/12 8:15 PM >>>
> > On Wed, Jun 6, 2012 at 8:41 AM, Jan Beulich <JBeulich@suse.com> wrote:
> > > Does migration fail in both directions? Or perhaps just from the
> > > newer to the older system (in which case I would guess you're
> > > not masking features properly on the newer one)?
> >
> > Yes, migration failed in both directions, but the error in Xend.log is
> > different. When I do migrate VM from core 2 quad to core i7 machine the
> > error in Xend.log is
>
> Are you certain (i.e. is this really with the VM freshly started on the Core2
> system)? I ask because this ...

Yes, I'm certain that the VM was freshly started on the Core2 system. When I migrate the VM from the Core2 to the Core i7 machine, I always see this error (i.e. "Couldn't set eXtended States for vcpu0 (22 = Invalid argument): Internal error") in xend.log on the Core i7 machine.

> > [2012-04-23 17:26:41 1405] INFO (XendCheckpoint:487) xc: error: Couldn't
> > set eXtended States for vcpu0 (22 = Invalid argument): Internal error
>
> ... indicates that XSAVE state was available for the guest, but couldn't
> be restored, yet iirc Core2 doesn't know XSAVE yet.
>
> Irrespective of that you could try booting the hypervisors on both sides
> with "no-xsave" and see whether that makes things any better (i.e.
> work at least in one direction). But generally, if you opened an entry in
> our bugzilla for this, we'd tell you that migration is only supported
> between feature-identical machines, or at best with features masked
> to the common subset on both systems.

Also, when I migrate a VM freshly started on the Core i7 system to the Core2, I get a different error in xend.log on the Core2 system (i.e. the destination system).
The error is:

[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:3150) XendDomainInfo.destroy: domid=2
[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:2465) Destroying device model
[2012-06-07 12:03:31 30425] INFO (image:711) s1 device model terminated
[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:2472) Releasing devices
[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:2478) Removing vif/0
[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:1278) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:2478) Removing vkbd/0
[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:1278) XendDomainInfo.destroyDevice: deviceClass = vkbd, device = vkbd/0
[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:2478) Removing console/0
[2012-06-07 12:03:31 30425] DEBUG (XendDomainInfo:1278) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:2478) Removing vbd/51728
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:1278) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/51728
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:2478) Removing vfb/0
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:1278) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:2470) No device model
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:2472) Releasing devices
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:2478) Removing vif/0
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:1278) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:2478) Removing vbd/51728
[2012-06-07 12:03:32 30425] DEBUG (XendDomainInfo:1278) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/51728
[2012-06-07 12:03:32 30425] ERROR (XendCheckpoint:421) Device 51712 (vbd) could not be connected.
losetup /dev/loop0 /home/elahe/xen/images/sles11/disk0.raw failed
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendCheckpoint.py", line 412, in restore
    wait_devs(dominfo)
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendCheckpoint.py", line 277, in wait_devs
    dominfo.waitForDevices() # Wait for backends to set up
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendDomainInfo.py", line 1239, in waitForDevices
    self.getDeviceController(devclass).waitForDevices()
  File "/usr/lib64/python2.7/site-packages/xen/xend/server/DevController.py", line 140, in waitForDevices
    return map(self.waitForDevice, self.deviceIDs())
  File "/usr/lib64/python2.7/site-packages/xen/xend/server/DevController.py", line 168, in waitForDevice
    "%s" % (devid, self.deviceClass, err))
VmError: Device 51712 (vbd) could not be connected. losetup /dev/loop0 /home/elahe/xen/images/sles11/disk0.raw failed
[2012-06-07 12:03:32 30425] ERROR (XendDomain:1200) Restore failed
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendDomain.py", line 1184, in domain_restore_fd
    dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib64/python2.7/site-packages/xen/xend/XendCheckpoint.py", line 422, in restore
    raise exn
VmError: Device 51712 (vbd) could not be connected. losetup /dev/loop0 /home/elahe/xen/images/sles11/disk0.raw failed

Elahe
On Sat, Jun 09, elahe shekuhi wrote:
> VmError: Device 51712 (vbd) could not be connected. losetup /dev/loop0
> /home/elahe/xen/images/sles11/disk0.raw failed

Is the directory /home/elahe reachable from both hosts?

Olaf
On Sat, Jun 9, 2012 at 9:47 AM, Olaf Hering <olaf@aepfle.de> wrote:
> On Sat, Jun 09, elahe shekuhi wrote:
>
> > VmError: Device 51712 (vbd) could not be connected. losetup /dev/loop0
> > /home/elahe/xen/images/sles11/disk0.raw failed
>
> Is the directory /home/elahe reachable from both hosts?

Yes, it is, using NFSv4.

> Olaf
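[Editor's note: since the image is on NFS on both sides, the losetup failure points more at loop-device state on the destination than at the path itself. A few hedged checks one could run on the destination host before retrying; these commands need appropriate privileges, and nothing here is specific to this thread beyond the image path from the log.]

```
# Is the image actually visible and readable through the NFS mount?
ls -l /home/elahe/xen/images/sles11/disk0.raw

# Are there stale loop devices left over from earlier failed restores?
losetup -a

# Is there a free loop device at all? If /dev/loop0 is still busy,
# the losetup call issued by xend fails exactly as in the log.
losetup -f
```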