Hello,

I have been struggling through the task of moving our infrastructure over to Xen VMs. We were initially using Ubuntu packages for both dom0 and our domUs, but experienced extreme instability, so we moved to CentOS, which has been much more reliable for dom0. Since we already had a bunch of Ubuntu VMs, we left them on the Ubuntu 2.6.24-19-xen kernel, but this has turned out to be a mistake -- we get frequent kernel oopses during heavy disk I/O. We modified the kernel to add NFS-root support, but that is the only change we made to the original config. All of our domUs mount their root file systems over NFS.

My problem is that I tried to upgrade the domU kernels to the latest kernel.org stable release (2.6.26.5) and did manage to get it working after some initial trouble (TCP checksum offloading was breaking NFS). However, the new kernel will not live migrate anymore. When I execute the live migrate command:

# xm migrate --live testvm 192.168.1.20

migration hangs forever. The VM changes its name to "migrate-testvm" and keeps running normally on the source host, and appears as "testvm" with state "-br---" and 0 CPU time on the destination machine. I left tcpdump running on the destination machine and captured an 84MB pcap file, which looked normal right up until all traffic completely stopped. If I just change the "kernel=" line in the config script back to the Ubuntu kernel, migration works again.
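As a reference for the checksum-offload trouble mentioned above, the usual workaround is to turn offloading off on the vif with ethtool inside the domU. This sketch assumes eth1 is the NFS-facing interface (as in the ip= line of the config); it only prints the commands so they can be reviewed before being run on the domU itself.

```shell
#!/bin/sh
# Sketch of the TCP-checksum-offload workaround. Assumption: eth1 is the
# NFS-root vif inside the domU; adjust the name for your setup.
iface="eth1"
tx_cmd="ethtool -K $iface tx off"   # disable TX checksum offload
rx_cmd="ethtool -K $iface rx off"   # disable RX checksum offload
# Printed rather than executed here; run these on the domU itself.
printf '%s\n%s\n' "$tx_cmd" "$rx_cmd"
```

Whether both directions need to be disabled depends on where the checksums are being corrupted; disabling only tx is often enough for NFS over a vif.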
Here's my VM configuration:
-------------------
name = 'testvm'
kernel = '/xen_vm/global/kernels/vmlinuz-2.6.26.5'
ramdisk = '/xen_vm/global/kernels/initrd.img-xen-latest'
memory = '256'
disk = ['tap:aio:/xen_vm/global/swaps/testvm.img,xvda1,w']
vif = [
    'mac=00:16:3e:5b:8d:5d,bridge=xenbr0',
    'mac=00:16:3e:99:9b:e7,bridge=xenbr1'
]
on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
extra = '2 console=hvc0 root=/dev/nfs ip=:192.168.1.12::::eth1:'
nfs_server = '192.168.1.12'
nfs_root = '/xen_vm/testvm'
-------------------

xend.log on source:
-------------------
[2008-09-18 15:51:11 xend 3751] DEBUG (balloon:127) Balloon: 786956 KiB free; need 2048; done.
[2008-09-18 15:51:11 xend 3751] DEBUG (XendCheckpoint:89) [xc_save]: /usr/lib/xen/bin/xc_save 33 38 0 0 1
-------------------

xend.log on destination:
-------------------
...
[2008-09-18 15:51:11 xend.XendDomainInfo 3331] DEBUG (XendDomainInfo:1350) XendDomainInfo.construct: None
[2008-09-18 15:51:11 xend 3331] DEBUG (balloon:127) Balloon: 262832 KiB free; need 2048; done.
...
[2008-09-18 15:51:11 xend 3331] DEBUG (blkif:24) exception looking up device number for xvda1: [Errno 2] No such file or directory: '/dev/xvda1'
[2008-09-18 15:51:11 xend 3331] DEBUG (DevController:110) DevController: writing {'backend-id': '0', 'virtual-device': '51713', 'device-type': 'disk', 'state': '1', 'backend': '/local/domain/0/backend/tap/10/51713'} to /local/domain/10/device/vbd/51713.
...
[2008-09-18 15:51:12 xend 3331] DEBUG (XendCheckpoint:198) restore:shadow=0x0, _static_max=0x100, _static_min=0x100,
[2008-09-18 15:51:12 xend 3331] DEBUG (balloon:127) Balloon: 262832 KiB free; need 262144; done.
[2008-09-18 15:51:12 xend 3331] DEBUG (XendCheckpoint:215) [xc_restore]: /usr/lib/xen/bin/xc_restore 24 10 1 2 0 0 0
-------------------

Xen version: xen-3.0-x86_32p
dom0: 2.6.18-92.1.10.el5xen

Anybody know what would cause this, or have any suggestions for tracking down the problem? I did find a post from someone who was seeing identical behavior who claimed he fixed it by enabling CPU Hotplug support, but I already have that enabled in the kernel.

Thanks,

Trevor

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
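An aside on the destination log above: the 'virtual-device': '51713' that DevController writes for xvda1 is just the Linux block device number, with the xvd major (202) packed as (major << 8) | minor. A one-liner confirms the value:

```shell
# xvda1 = block major 202 (xvd), partition/minor 1. xend writes the packed
# device number (major << 8) | minor into xenstore -- 51713 for xvda1.
major=202
minor=1
devno=$(( (major << 8) | minor ))
echo "$devno"    # prints 51713, matching the xend.log entry
```

The "No such file or directory: '/dev/xvda1'" line appears to be xend failing a lookup in dom0's /dev before falling back to computing the number from the name; it is logged at DEBUG level and is probably not related to the stall.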
Hi!

Migrating, save, restore, etc. are scheduled for 2.6.27. Ballooning is also broken in early 2.6.26; I don't know whether it is still broken. Even using more than one vcpu makes most domU kernels unreliable. The only reliable and fully-supported kernels seem to be the original xen.org kernels (2.6.18.8), both as dom0 and domU. I've had a lot of struggle related to this topic.

Cheers:
Tamas

PS: Take a look at the archives of the xen-devel mailing list at around Tue, 22 Jul 2008 18:11:26 -0600! Thread: [Xen-devel] PROBLEM: Xen ballon driver seems to be broken

On Thursday, 2008-09-18 at 15:55, Trevor Bentley wrote:
> Hello,
>
> I have been struggling through the task of moving our infrastructure
> over to Xen VMs. [...]
Thiago Camargo Martins Cordeiro
2008-Sep-18 20:19 UTC
Re: [Xen-users] Migration stalls with 2.6.26.5 kernel
Trevor,

Let me talk about my experience with distros that carry Xen in their trees.

The most stable packaged Xen-3.2 is on Debian Lenny with the 2.6.18-xen kernel from Debian Etch. See http://wiki.debian.org/Xen - Installation on lenny.

If you are "nutz", use Debian Lenny with Xen-3.3 and Linux-2.6.18.8-xen-3.3 from xen.org, compiled by yourself.

I have tried Ubuntu Hardy with the 2.6.24-21-xen kernel and it isn't stable at all; I can't use nosmp with it, and there are some aacraid bugs. My dom0s are all Ubuntu Hardy with Xen-3.3 and Linux 2.6.18.8-xen from xen.org, but now I have a problem with the python-xml package. On Debian I do not see any bug or instability.

That's my opinion!

ps.: Sorry about my english... ;-)

Regards,
Thiago

2008/9/18 Trevor Bentley <trevor.bentley@sherpasolutions.net>

> Hello,
>
> I have been struggling through the task of moving our infrastructure over
> to Xen VMs. [...]
"Nemeth, Tamas" <nice@titanic.nyme.hu> writes:

> Migrating, save, restore, etc. are scheduled for 2.6.27.

Yes, but the Debian 2.6.26 kernels have the necessary patches backported, so they migrate just fine. I don't know how Ubuntu kernels relate to those of Debian, but would expect much similarity. Or simply use the Debian kernel.

-- 
Regards,
Feri.
Trevor Bentley
2008-Sep-19 13:48 UTC
Re: [Xen-users] Migration stalls with 2.6.26.5 kernel
It's very unfortunate that Xen is so picky about its domU kernels. I wonder how hardware providers based on Xen (like Virtual Iron) deal with this. It seems a shame if they have to give you specific kernels to use.

Thanks for the Debian tip, though. I downloaded the binary linux-image-* and linux-modules-* packages for Lenny, made an initrd with the modules, and that kernel boots and migrates. The first time I migrated it, though, migration failed and the domU died with a bunch of "WARNING: g.e. still in use!" errors, so once again I'm not confident about the stability.

This leaves me with:

  Ubuntu kernel - NFS-root works, migration works, kernel unstable
  CentOS kernel - NFS-root broken (ignores no_root_squash)
  kernel.org    - NFS-root works, migration broken, kernel stable
  Debian        - NFS-root works, migration works, kernel possibly unstable

It's unfortunate that it is so much trouble to get this working. Xen is supposed to make server administration easier, not harder! Hopefully it'll stabilize soon.

Thanks,

Trevor

Thiago Camargo Martins Cordeiro wrote:

> Trevor,
>
> Let me talk about my experience with distros that have supported Xen
> on their trees. [...]
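For anyone reproducing the Debian-kernel route described above, the steps look roughly like this. The package version string and the NFS-root path are assumptions for illustration (check packages.debian.org for the actual Lenny package names); the script only builds and prints the commands for review rather than executing them.

```shell
#!/bin/sh
# Hypothetical sketch: unpack the Debian kernel/modules .debs into the
# domU's NFS root and build a matching initrd there. kver and nfsroot
# are assumed example values, not from the original post.
kver="2.6.26-1-xen-686"
nfsroot="/xen_vm/testvm"
cmd1="dpkg -x linux-image-${kver}_*.deb $nfsroot"
cmd2="dpkg -x linux-modules-${kver}_*.deb $nfsroot"
cmd3="chroot $nfsroot mkinitramfs -o /boot/initrd.img-$kver $kver"
printf '%s\n%s\n%s\n' "$cmd1" "$cmd2" "$cmd3"
```

The resulting /boot/initrd.img-* and the kernel image can then be copied to wherever the dom0 config's kernel= and ramdisk= lines point.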