I've been banging my head against a wall for a couple of days now. Does
anyone know if live checkpointing ('xm save -c') is currently working in
3.4.x? I've now tried 3.4.0 on OracleVM, 3.4.1 on CentOS 5.4 and
3.4.2 on OpenSolaris, and each platform gives me the same results. It
seems like the suspend works but does not release the devices, so when
the resume runs, it freaks out because the devices are already attached. I
don't know enough about Xen to know whether the devices are supposed to
remain attached (because it doesn't destroy the domain) or not. Every
time I try a live checkpoint, the VM winds up suspended, and the only way
to bring it back to life is to run 'xm destroy' on it and then 'xm
resume'. I'll be happy to provide more logs if I've left something
out. The following is from an OracleVM hypervisor (yes, OracleVM doesn't
support checkpointing, but the results are the same with vanilla Xen).
It also doesn't matter whether I use a file-backed device for the disk, a
physical device, or a file on an NFS share; the result is the same.

Thanks,
Tom
[root@compute-01 ~]# rpm -qa | grep xen
xen-devel-3.4.0-0.0.23.el5
xen-tools-3.4.0-0.0.23.el5
xen-debugger-3.4.0-0.0.23.el5
xen-3.4.0-0.0.23.el5
xen-64-3.4.0-0.0.23.el5
[root@compute-01 ~]# uname -a
Linux compute-01.example.com 2.6.18-128.2.1.4.9.el5xen #1 SMP Fri Oct 9 14:57:31 EDT 2009 i686 i686 i386 GNU/Linux
[root@compute-01 ~]# cat /OVS/running_pool/1_ovm_pv_01_example_com/vm.cfg
bootargs = 'bridge=xenbr0,mac=00:16:3E:AA:EB:08,type=netfront'
bootloader = '/usr/bin/pypxeboot'
disk = ['file:/tmp/System.img,xvda,w']
maxmem = 512
memory = 512
name = '1_ovm_pv_01_example_com'
on_crash = 'restart'
on_reboot = 'restart'
uuid = '7408c627-3232-4c1d-b5e3-1cf05cb015c8'
vcpus = 1
vfb = ['type=vnc,vncunused=1,vnclisten=0.0.0.0,vncpasswd=<removed>']
vif = ['bridge=xenbr0,mac=00:16:3E:AA:EB:08,type=netfront']
vif_other_config = []
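For readers less familiar with the disk line in the config above, here is a minimal sketch of how its three comma-separated parts map onto the 'uname', 'dev' and 'mode' keys that show up later in the xend log for the vbd device. The helper name parse_disk is hypothetical and is not part of xend; it only handles the simple 'type:path,dev,mode' form used in this post.

```python
def parse_disk(spec):
    """Split a 'uname,dev,mode' disk spec; uname itself is 'type:path'.

    Hypothetical helper for illustration only, not xend's own parser.
    """
    uname, dev, mode = spec.split(",")
    backend, _, path = uname.partition(":")
    return {"uname": uname, "backend": backend, "path": path,
            "dev": dev, "mode": mode}


# The disk entry from the vm.cfg quoted above.
disk = ['file:/tmp/System.img,xvda,w']

for entry in disk:
    parsed = parse_disk(entry)
    print("%(dev)s: %(backend)s backend at %(path)s, mode %(mode)s" % parsed)
```

The same split would apply to a 'phy:' entry, which is one of the other backends tried in the post.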
xend.log
[2010-03-02 17:22:38 2840] DEBUG (XendCheckpoint:110) [xc_save]:
/usr/lib/xen/bin/xc_save 43 6 0 0 0
[2010-03-02 17:22:38 2840] INFO (XendCheckpoint:418) xc_save: failed to
get the suspend evtchn port
[2010-03-02 17:22:38 2840] INFO (XendCheckpoint:418)
[2010-03-02 17:22:38 2840] DEBUG (XendCheckpoint:389) suspend
[2010-03-02 17:22:38 2840] DEBUG (XendCheckpoint:113) In
saveInputHandler suspend
[2010-03-02 17:22:38 2840] DEBUG (XendCheckpoint:115) Suspending 6 ...
[2010-03-02 17:22:38 2840] DEBUG (XendDomainInfo:520)
XendDomainInfo.shutdown(suspend)
[2010-03-02 17:22:38 2840] DEBUG (XendDomainInfo:1727)
XendDomainInfo.handleShutdownWatch
[2010-03-02 17:22:38 2840] DEBUG (XendDomainInfo:1727)
XendDomainInfo.handleShutdownWatch
[2010-03-02 17:22:38 2840] INFO (XendDomainInfo:1915) Domain has
shutdown: name=migrating-1_ovm_pv_01_example_com id=6 reason=suspend.
[2010-03-02 17:22:38 2840] INFO (XendCheckpoint:121) Domain 6 suspended.
[2010-03-02 17:22:38 2840] DEBUG (XendCheckpoint:130) Written done
[2010-03-02 17:22:38 2840] INFO (XendCheckpoint:418) Had 0 unexplained
entries in p2m table
[2010-03-02 17:22:46 2840] INFO (XendCheckpoint:418) Saving memory
pages: iter 1 0% 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65%
70% 75% 80% 85% 90% 95%
 1: sent 131072, skipped 0, delta 8194ms, dom0 17%, target 0%, sent
524Mb/s, dirtied 0Mb/s 0 pages
[2010-03-02 17:22:46 2840] INFO (XendCheckpoint:418) Total pages sent=
131072 (0.98x)
[2010-03-02 17:22:46 2840] INFO (XendCheckpoint:418) (of which 0 were
fixups)
[2010-03-02 17:22:46 2840] INFO (XendCheckpoint:418) All memory is saved
[2010-03-02 17:22:47 2840] INFO (XendCheckpoint:418) Save exit rc=0
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2804)
XendDomainInfo.resumeDomain(6)
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2221) Destroying device
model
[2010-03-02 17:22:47 2840] INFO (image:553)
migrating-1_ovm_pv_01_example_com device model terminated
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2228) Releasing devices
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2241) Removing vif/0
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:1144)
XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2241) Removing vbd/51712
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:1144)
XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/51712
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2241) Removing vkbd/0
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:1144)
XendDomainInfo.destroyDevice: deviceClass = vkbd, device = vkbd/0
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2241) Removing vfb/0
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:1144)
XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2241) Removing console/0
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:1144)
XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
[2010-03-02 17:22:47 2840] INFO (XendDomainInfo:3028) Dev 51712 still
active, looping...
[2010-03-02 17:22:47 2840] INFO (XendDomainInfo:3028) Dev 51712 still
active, looping...
<many>
[2010-03-02 17:23:17 2840] INFO (XendDomainInfo:3028) Dev 51712 still
active, looping...
[2010-03-02 17:23:17 2840] INFO (XendDomainInfo:3034) Dev still active
but hit max loop timeout
[2010-03-02 17:23:17 2840] INFO (XendDomainInfo:3047) Dev 0 still
active, looping...
[2010-03-02 17:23:17 2840] INFO (XendDomainInfo:3047) Dev 0 still
active, looping...
<many>
[2010-03-02 17:23:47 2840] INFO (XendDomainInfo:3047) Dev 0 still
active, looping...
[2010-03-02 17:23:47 2840] INFO (XendDomainInfo:3053) Dev still active
but hit max loop timeout
[2010-03-02 17:23:47 2840] DEBUG (XendDomainInfo:2826)
XendDomainInfo.resumeDomain: devices released
[2010-03-02 17:23:47 2840] DEBUG (XendDomainInfo:1727)
XendDomainInfo.handleShutdownWatch
[2010-03-02 17:23:47 2840] DEBUG (XendDomainInfo:1640) Storing domain
details: {'console/ring-ref': '1211263', 'image/entry': '2149580800',
'console/port': '2', 'store/ring-ref': '1211264', 'image/loader':
'generic', 'vm': '/vm/b9efadc3-3dc5-4c8b-bb32-27e3f6217ff3',
'control/platform-feature-multiprocessor-suspend': '1',
'image/guest-os': 'linux',
'image/features/writable-descriptor-tables': '1', 'image/virt-base':
'2147483648', 'memory/target': '524288', 'image/guest-version': '2.6',
'image/features/supervisor-mode-kernel': '1', 'console/limit':
'1048576', 'image/paddr-offset': '2147483648',
'image/hypercall-page': '2149605376', 'cpu/0/availability': 'online',
'image/features/pae-pgdir-above-4gb': '1',
'image/features/writable-page-tables': '1', 'console/type': 'ioemu',
'image/features/auto-translated-physmap': '1', 'name':
'migrating-1_ovm_pv_01_example_com', 'domid': '6',
'image/xen-version': 'xen-3.0', 'store/port': '1'}
[2010-03-02 17:23:47 2840] INFO (XendDomainInfo:2180) createDevice: vkbd
: {'devid': 0, 'uuid': '89b96740-8d56-e9a6-4a3b-cbddf1810bf1'}
[2010-03-02 17:23:47 2840] DEBUG (DevController:95) DevController:
writing {'protocol': 'x86_64-abi', 'state': '1', 'backend-id': '0',
'backend': '/local/domain/0/backend/vkbd/6/0'} to
/local/domain/6/device/vkbd/0.
[2010-03-02 17:23:47 2840] DEBUG (DevController:97) DevController:
writing {'frontend-id': '6', 'domain':
'migrating-1_ovm_pv_01_example_com', 'frontend':
'/local/domain/6/device/vkbd/0', 'state': '1', 'online': '1'} to
/local/domain/0/backend/vkbd/6/0.
[2010-03-02 17:23:47 2840] INFO (XendDomainInfo:2180) createDevice: vfb
: {'vncunused': '1', 'other_config': {'vncunused': '1', 'vncpasswd':
'XXXXXXXX', 'vnclisten': '0.0.0.0', 'vnc': '1', 'xauthority':
'/root/.Xauthority'}, 'vnc': '1', 'xauthority': '/root/.Xauthority',
'vnclisten': '0.0.0.0', 'vncpasswd': 'XXXXXXXX', 'location':
'0.0.0.0:5900', 'devid': 0, 'uuid':
'3f989332-a2f2-5a41-1688-b460d3ac8192'}
[2010-03-02 17:23:47 2840] DEBUG (DevController:95) DevController:
writing {'protocol': 'x86_64-abi', 'state': '1', 'backend-id': '0',
'backend': '/local/domain/0/backend/vfb/6/0'} to
/local/domain/6/device/vfb/0.
[2010-03-02 17:23:47 2840] DEBUG (DevController:97) DevController:
writing {'vncunused': '1', 'domain':
'migrating-1_ovm_pv_01_example_com', 'frontend':
'/local/domain/6/device/vfb/0', 'xauthority': '/root/.Xauthority',
'frontend-id': '6', 'vnclisten': '0.0.0.0', 'vncpasswd': 'XXXXXXXX',
'state': '1', 'location': '0.0.0.0:5900', 'online': '1', 'vnc': '1',
'uuid': '3f989332-a2f2-5a41-1688-b460d3ac8192'} to
/local/domain/0/backend/vfb/6/0.
[2010-03-02 17:23:47 2840] INFO (XendDomainInfo:2180) createDevice:
console : {'location': '2', 'devid': 0, 'protocol': 'vt100', 'uuid':
'6ce7f874-5cdf-d038-3a1d-27ad1baa3497', 'other_config': {}}
[2010-03-02 17:23:47 2840] DEBUG (DevController:95) DevController:
writing {'protocol': 'x86_64-abi', 'state': '1', 'backend-id': '0',
'backend': '/local/domain/0/backend/console/6/1'} to
/local/domain/6/device/console/1.
[2010-03-02 17:23:47 2840] DEBUG (DevController:97) DevController:
writing {'domain': 'migrating-1_ovm_pv_01_example_com', 'frontend':
'/local/domain/6/device/console/1', 'uuid':
'6ce7f874-5cdf-d038-3a1d-27ad1baa3497', 'frontend-id': '6', 'state':
'1', 'location': '2', 'online': '1', 'protocol': 'vt100'} to
/local/domain/0/backend/console/6/1.
[2010-03-02 17:23:47 2840] INFO (XendDomainInfo:2180) createDevice: vbd
: {'uuid': '56336c23-4848-c780-3dcd-fa0305797f25', 'bootable': 1,
'devid': 51712, 'driver': 'paravirtualised', 'dev': 'xvda', 'uname':
'file:/tmp/System.img', 'mode': 'w'}
[2010-03-02 17:23:47 2840] ERROR (XendDomainInfo:2843)
XendDomainInfo.resume: xc.domain_resume failed on domain 6.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2837, in resumeDomain
    self._createDevices()
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2182, in _createDevices
    devid = self._createDevice(devclass, config)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2149, in _createDevice
    return self.getDeviceController(deviceClass).createDevice(devConfig)
  File "/usr/lib/python2.4/site-packages/xen/xend/server/DevController.py", line 91, in createDevice
    raise VmError("Device %s is already connected." % dev_str)
VmError: Device xvda (51712, vbd) is already connected.
[2010-03-02 17:23:47 2840] DEBUG (XendDomainInfo:2845)
XendDomainInfo.resumeDomain: completed
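To make the sequence in the log easier to follow, here is a rough sketch that pulls out three things from it: how long xend waited in each "still active, looping..." phase before giving up, which device the final VmError names, and what the device number 51712 decodes to. The LOG string is an abridged copy of lines from the log above with the mail wrapping undone; the function names are hypothetical, and the major*256 + minor split assumes the usual Linux numbering in which major 202 is the Xen virtual block device (xvd).

```python
import re
from datetime import datetime

# Abridged copy of the xend.log entries quoted above (wrapping undone).
LOG = """\
[2010-03-02 17:22:47 2840] DEBUG (XendDomainInfo:2228) Releasing devices
[2010-03-02 17:22:47 2840] INFO (XendDomainInfo:3028) Dev 51712 still active, looping...
[2010-03-02 17:23:17 2840] INFO (XendDomainInfo:3034) Dev still active but hit max loop timeout
[2010-03-02 17:23:17 2840] INFO (XendDomainInfo:3047) Dev 0 still active, looping...
[2010-03-02 17:23:47 2840] INFO (XendDomainInfo:3053) Dev still active but hit max loop timeout
[2010-03-02 17:23:47 2840] DEBUG (XendDomainInfo:2826) XendDomainInfo.resumeDomain: devices released
VmError: Device xvda (51712, vbd) is already connected.
"""

# [timestamp pid] LEVEL (module:line) message
ENTRY = re.compile(
    r"^\[(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) \d+\] (\w+) \([\w.]+:\d+\) (.*)$")


def parse(log):
    """Yield (timestamp, level, message) for each timestamped entry."""
    for line in log.splitlines():
        m = ENTRY.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            yield ts, m.group(2), m.group(3)


def release_timeouts(log):
    """Seconds from 'Releasing devices' to each 'hit max loop timeout'."""
    start, waits = None, []
    for ts, _level, msg in parse(log):
        if msg == "Releasing devices":
            start = ts
        elif "hit max loop timeout" in msg and start is not None:
            waits.append(int((ts - start).total_seconds()))
    return waits


def already_connected(log):
    """Return (dev, devid, devclass) from the VmError line, or None."""
    m = re.search(r"VmError: Device (\S+) \((\d+), (\w+)\) is already connected", log)
    return (m.group(1), int(m.group(2)), m.group(3)) if m else None


def decode_vbd(devid):
    """Split a vbd device number into (major, minor), assuming major*256 + minor."""
    return divmod(devid, 256)


print(release_timeouts(LOG))               # [30, 60]
print(already_connected(LOG))              # ('xvda', 51712, 'vbd')
print(decode_vbd(51712))                   # (202, 0)
```

Read this way, the log shows two 30-second waits that both time out, after which resumeDomain reports "devices released" anyway and then fails when it tries to recreate the vbd that never went away.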
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Pasi Kärkkäinen
2010-Mar-04 07:10 UTC
Re: [Xen-users] Live checkpointing not working in 3.4.x?
On Thu, Mar 04, 2010 at 12:12:56AM -0600, Tom Verbiscer wrote:
> I've been banging my head against a wall for a couple days now. Does
> anyone know if live checkpointing ('xm save -c') is currently working in
> 3.4.x?

Does normal "xm save" and then "xm restore" work for you?

What's the guest kernel version? save/restore heavily depends on the
guest kernel version/features (for pv guests).

-- Pasi
Tom Verbiscer
2010-Mar-04 14:05 UTC
Re: [Xen-users] Live checkpointing not working in 3.4.x?
Normal 'xm save' and 'xm restore' work just fine. My PV guest kernel is:

[root@ovm-pv-01 ~]# uname -a
Linux ovm-pv-01.example.com 2.6.18-164.el5xen #1 SMP Thu Sep 3 04:41:04 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Thanks,
Tom
Pasi Kärkkäinen
2010-Mar-04 14:07 UTC
Re: [Xen-users] Live checkpointing not working in 3.4.x?
On Thu, Mar 04, 2010 at 08:05:24AM -0600, Tom Verbiscer wrote:
> Normal 'xm save' and 'xm restore' works just fine. My PV guest kernel is:

Ok.

> [root@ovm-pv-01 ~]# uname -a
> Linux ovm-pv-01.example.com 2.6.18-164.el5xen #1 SMP Thu Sep 3 04:41:04
> EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

This kernel should be OK. You should update to the latest hotfix release
though (-164.something).

Does it work with the default Xen 3.1.2 that comes with RHEL5?

-- Pasi
Tom Verbiscer
2010-Mar-05 08:04 UTC
Re: [Xen-users] Live checkpointing not working in 3.4.x?
Just tried it on 2.6.18-164.11.1.el5xen and it has the same issue.

As far as using the default RHEL5 stuff goes, CentOS at least only comes
with Xen 3.0.3, which doesn't support live checkpointing.

I tried it with Xen 3.4.0 and a Fedora 9 domU, just to try it out with a
somewhat recent domU kernel. When I did that, the command ('xm save -c
<domain#> <file>') hung. The VM never suspended; it just kept running.

Any ideas?

Thanks much,
Tom
Pasi Kärkkäinen
2010-Mar-05 08:17 UTC
Re: [Xen-users] Live checkpointing not working in 3.4.x?
On Fri, Mar 05, 2010 at 02:04:19AM -0600, Tom Verbiscer wrote:
> Just tried it on 2.6.18-164.11.1.el5xen and it has the same issue.
>
> As far as using the default RHEL5 stuff, CentOS at least only comes with
> Xen 3.0.3 which doesn't support live checkpointing.

RHEL5/CentOS5 has Xen *hypervisor* version 3.1.2 plus a lot of patches
from Red Hat. You can verify that with "xm info": see the xen_major,
xen_minor and xen_extra fields. Only the Xen management tools in EL5 are
based on 3.0.3.

> I tried it with Xen 3.4.0 and a Fedora 9 domU just to try it out with a
> somewhat recent domU kernel. When I did that, the command ('xm save -c
> <domain#> <file>') hung. The VM never suspended, it just kept running.
>
> Any ideas?

Maybe try with the latest http://xenbits.xen.org/xen-3.4-testing.hg
(3.4.3-rc3).

-- Pasi
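As a small illustration of the "xm info" check mentioned above, the sketch below reads the xen_major, xen_minor and xen_extra fields from "xm info"-style text and joins them into a hypervisor version string. The SAMPLE text is made up for illustration (it is not output from any host in this thread), and the "key : value" layout is an assumption about how "xm info" prints its fields; the function names are hypothetical.

```python
# Hypothetical sample in the "key : value" layout assumed for "xm info".
SAMPLE = """\
host                   : compute-01.example.com
xen_major              : 3
xen_minor              : 1
xen_extra              : .2
"""


def parse_xm_info(text):
    """Turn 'key : value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def hypervisor_version(fields):
    """Join xen_major, xen_minor and xen_extra into one version string."""
    return "%s.%s%s" % (fields["xen_major"], fields["xen_minor"],
                        fields["xen_extra"])


print(hypervisor_version(parse_xm_info(SAMPLE)))  # 3.1.2
```

On a real dom0 the text would come from running "xm info" itself; comparing the result with the version of the installed tools packages shows whether the hypervisor and the management tools are on different releases.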