Cole, Ray
2005-Dec-07 16:20 UTC
RE: [Xen-users] live migration with xen 2.0.7 with fibre channel on Debian - help needed
I had this exact same problem with 2.0.7. I had done a little investigation and found scheduled_work gets called to schedule the shutdown in the user domain kernel, but the shutdown work that gets scheduled never actually gets called. I'm glad someone else is seeing this same problem now :-) Like you, it worked a number of times in a row, then would fail, and it didn't seem to matter if there was really any load going on or not.

-- Ray

-----Original Message-----
From: xen-users-bounces@lists.xensource.com [mailto:xen-users-bounces@lists.xensource.com] On Behalf Of Michael Mey
Sent: Wednesday, December 07, 2005 9:00 AM
To: xen-users@lists.xensource.com
Subject: [Xen-users] live migration with xen 2.0.7 with fibre channel on Debian - help needed

Hi,

I'd like to test the stability of live migration during heavy load of domU.

scenario:
- both dom0s and domU are running on Debian Sarge.
- script on dom0 triggers live-migration to the other dom0
- domU is running I/O tests, e.g. bonnie++
- domU's root (ext3) and swap fs are stored on two partitions in a san
- san is connected using fibre channel cards to both dom0s
- san in dom0 works fine (tested with bonnie++ and own consistency test)

observation:
- migration works several times, usually something between 10 and 30 times
- then something strange happens:

A) either domU has completely disappeared on both dom0s
xend.log on the target host of the last migration says:
<snip>
[2005-12-06 15:55:31 xend] INFO (XendRoot:113) EVENT> xend.console.create [14, 14, 9614]
[2005-12-06 15:55:32 xend] INFO (XendRoot:113) EVENT> xend.domain.create ['debian1', '14']
[2005-12-06 15:56:02 xend] DEBUG (blkif:203) Connecting blkif to event channel <BlkifBackendInterface 14 0> ports=16:4
[2005-12-06 15:56:02 xend] DEBUG (XendDomain:244) XendDomain>reap> domain died name=debian1 id=14
[2005-12-06 15:56:02 xend] INFO (XendDomain:568) Destroying domain: name=debian1
</snip>
xfrd.log on both dom0s says migration was successful

OR

B) domU is in paused-state on the target machine after migration,
xend.log and xfrd.log seem to be ok on both dom0s
domU _cannot_ be unpaused nor directly accessed using xm console
xm vbd-destroy is working
the only thing that can be done is xm destroy.

The thing I am wondering about is why domU suddenly gets crashed after several successful migrations.

Any help or ideas would be appreciated.

Regards,
Michael

--
----------------------------------------------------------------------------------------
Michael Mey
Thinking Objects Software GmbH    |   mailto: michael.mey@to.com
Lilienthalstrasse 2/1             |   phone: +49 711 88770-147
70825 Stuttgart-Korntal, Germany  |   fax: +49 711 88770-449
----------------------------------------------------------------------------------------
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
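To spot case A quickly across many test runs, the "domain died" entries can be pulled out of xend.log automatically. The following is only a minimal sketch: the pattern is modelled on the single log line quoted above, the function name `find_reaped_domains` is made up for illustration, and other xend versions may word the message differently.

```python
import re

# Pattern modelled on the one "domain died" line quoted in the message;
# it is an assumption that other reap entries look the same.
REAP_RE = re.compile(
    r"^\[(?P<when>[^\]]+?) xend\] \w+ \(XendDomain:\d+\) "
    r"XendDomain>reap> domain died name=(?P<name>\S+) id=(?P<id>\d+)"
)


def find_reaped_domains(log_lines):
    """Return (timestamp, domain name, domain id) for every reap entry."""
    hits = []
    for line in log_lines:
        match = REAP_RE.match(line.strip())
        if match:
            hits.append((match["when"], match["name"], int(match["id"])))
    return hits


if __name__ == "__main__":
    # Hypothetical log location; adjust to wherever xend writes its log.
    with open("/var/log/xend.log", encoding="utf-8", errors="replace") as log:
        for when, name, domid in find_reaped_domains(log):
            print(f"{when}: domain {name} (id {domid}) was reaped")
```

This only reports what xend itself logged; it would not detect case B, where both logs look fine and the domain is left paused.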
Steven Hand
2005-Dec-07 16:25 UTC
Re: [Xen-users] live migration with xen 2.0.7 with fibre channel on Debian - help needed
> I had this exact same problem with 2.0.7. I had done a little
> investigation and found scheduled_work gets called to schedule the
> shutdown in the user domain kernel, but the shutdown work that gets
> scheduled never actually gets called. I'm glad someone else is
> seeing this same problem now :-) Like you, it worked a number of
> times in a row, then would fail, and it didn't seem to matter if
> there was really any load going on or not.

At least one issue with live migration in 2.0.7 is fixed in the
2.0-testing tree (cset 3513:80a8b005b669 if you want to just
apply the patch directly).

This doesn't explain the situation where your domain dies tho; do
you have any console information from the domain in question?

cheers,

S.
Michael Mey
2005-Dec-08 09:54 UTC
Re: [Xen-users] live migration with xen 2.0.7 with fibre channel on Debian - help needed
On Wednesday 07 December 2005 17:25, Steven Hand wrote:
> > I had this exact same problem with 2.0.7. I had done a little
> > investigation and found scheduled_work gets called to schedule the
> > shutdown in the user domain kernel, but the shutdown work that gets
> > scheduled never actually gets called. I'm glad someone else is
> > seeing this same problem now :-) Like you, it worked a number of
> > times in a row, then would fail, and it didn't seem to matter if
> > there was really any load going on or not.
>
> At least one issue with live migration in 2.0.7 is fixed in the
> the 2.0-testing tree (cset 3513:80a8b005b669 if you want to just
> apply the patch directly)
>
> This doesn't explain the situation where your domain dies tho; do
> you have any console information from the domain in question?

The only thing I have is some syslog output of the domU, but there's nothing more than the usual lines:
...
Dec 6 15:47:16 debian1 kernel: Xen reported: 2806.425 MHz processor.
Dec 6 15:51:12 debian1 kernel: Xen reported: 2806.429 MHz processor.
Dec 6 15:54:54 debian1 kernel: Xen reported: 2806.425 MHz processor.
...

and the log file of my i/o testing script, but these are only md5 checksums.
The script writes a dummy text into a file on the hd several times and finally builds an md5 checksum of the file as an integrity test.
That logfile is fine, it's always the same md5.

Is there a possibility to further investigate what is happening with domU?

The funny thing is, yesterday afternoon I started a new testrun, this time using a ramdisk as storage for the i/o test. That domU is still running (after approx. 350 migrations).
So it seems to me as if there is a problem with storage during migration, which sometimes occurs earlier, sometimes later.

The fibre channel adapters and the san are tested and work 100% correctly, so this shouldn't be the reason.

Regards,
Michael
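The i/o test described above (write a dummy text several times, then take an md5 of the file) could look roughly like the sketch below. This is not Michael's actual script: the function names, the dummy text, the repeat count and the file path are all assumptions made for illustration.

```python
import hashlib
import os


def write_and_checksum(path, text="dummy text\n", repeats=1000):
    """Write `text` to `path` `repeats` times, then return the file's MD5."""
    # newline="" keeps "\n" untranslated so the checksum is platform-independent
    with open(path, "w", encoding="ascii", newline="") as handle:
        for _ in range(repeats):
            handle.write(text)
        handle.flush()
        os.fsync(handle.fileno())  # ask the kernel to push data to the device

    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_checksum(text="dummy text\n", repeats=1000):
    """MD5 the same content in memory, as the reference value."""
    return hashlib.md5((text * repeats).encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Hypothetical target file on the SAN-backed filesystem inside the domU.
    result = write_and_checksum("/tmp/io-test.dat")
    status = "ok" if result == expected_checksum() else "MISMATCH"
    print(result, status)
```

One caveat with this kind of test: the read-back may be served from the page cache rather than from the disk, so an unchanged md5 shows the file content is consistent as seen by the guest, not necessarily that every block reached the SAN intact.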
Hi,

A very basic question: do I need to allocate a static IP for each of the domains I create?

Thanks,
Dakshina
Michael Mey
2005-Dec-09 09:05 UTC
Re: [Xen-users] live migration with xen 2.0.7 with fibre channel on Debian - help needed
On Thursday 08 December 2005 10:54, Michael Mey wrote:
> [...]
> The funny thing is, yesterday afternoon I started a new testrun, this time
> using a ramdisk as storage for the i/o test. That domU is still running
> (after approx. 350 migrations).
> So it seems to me as if there is a problem with storage during migration,
> which sometimes occurs earlier, sometimes later.
>
> The fibre channel adapters and the san is tested and works 100% correctly,
> so this shouldn't be the reason.

After getting the Xen 3.0 source tarball (the release from Monday) and compiling the kernels with support for my fibre channel cards, it finally seems to work (~400 migrations while the domU runs its i/o tests, and it's still running)!! :o)

Cheers,
Michael
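A test run like the ones counted above (10 to 30 migrations before a failure on 2.0.7, ~400 on 3.0) can be driven by a small loop that bounces the domain between the two dom0s. The sketch below is one possible way to do it, not the script used in this thread: it assumes both dom0s are reachable over ssh from wherever the loop runs, and the host names and the function names are hypothetical. The only Xen command it relies on is `xm migrate --live <domain> <host>`.

```python
import subprocess


def migrate_command(domain, target_host):
    """Build the xm command line for a live migration."""
    return ["xm", "migrate", "--live", domain, target_host]


def ping_pong(domain, hosts, rounds, run=subprocess.call):
    """Migrate `domain` back and forth between two hosts.

    Returns the number of migrations that completed with exit status 0.
    `run` is injectable so the loop can be exercised without a Xen host.
    """
    current = 0  # index of the dom0 currently hosting the domain
    for done in range(rounds):
        target = hosts[1 - current]
        # Assumption: the migration is started on the current host via ssh.
        command = ["ssh", hosts[current]] + migrate_command(domain, target)
        if run(command) != 0:
            return done
        current = 1 - current
    return rounds


if __name__ == "__main__":
    # Hypothetical dom0 host names.
    completed = ping_pong("debian1", ["dom0-a", "dom0-b"], rounds=500)
    print(f"{completed} migrations completed")
```

Note that the exit status alone would not have caught the failures reported earlier in the thread, where xfrd.log claimed success while the domU had died or was stuck in the paused state; a real harness would also need to check the domain's state on the target after each round.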
Dakshina Dasari wrote:
> Hi,
> A very basic question --Do i need to allocate a static IP for each of
> the domains I create ???
> --Thanks
> Dakshina

Depends on your configuration.
Assuming you use a bridge, treat every domU the way you treat other servers on that network. Its IP address can be assigned through DHCP or statically; your choice.

If you want to start a new thread then please remove old text.

--
Fajar
Hello,

I have a guest domain up and running (with the FC4 xenU kernel 2.6.11.1-1226), but I am unable to allocate a static IP address to it. I referred to the Fedora Wiki QuickStart page and followed the steps to create the guest domain.
DHCP is working fine, but I need a static IP assigned to the system.
Is there some special parameter that must be configured? Any particular kernel module that has to be loaded?
Also, when I run the neat-tui tool, do I specify the device to be configured as eth0 or the vif device?

Help appreciated.

Regards,
Dakshina

Fajar A. Nugraha wrote:
> Dakshina Dasari wrote:
>
>> Hi,
>> A very basic question --Do i need to allocate a static IP for each of
>> the domains I create ???
>> --Thanks
>> Dakshina
>>
> Depends on your configuration.
> Assuming you use bridge, then treat every dom-U's the way you treat
> other servers on that network. It might be assigning ip address thru
> dhcp or static. your choice.
>
> If you want to start a new thread then please remove old text.
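Following the advice quoted above to treat the domU like any other server on the bridge, a static address on a Fedora guest is normally set inside the guest in `/etc/sysconfig/network-scripts/ifcfg-eth0`. The sketch below just renders such a file; it assumes the usual Xen bridged setup in which the guest sees its interface as eth0 (the vif device being the dom0-side end), and the addresses are documentation placeholders, not values from this thread. The helper name `render_ifcfg` is made up for illustration.

```python
def render_ifcfg(ip, netmask, gateway=None, device="eth0"):
    """Render a static-address ifcfg file for a Fedora/Red Hat style guest."""
    lines = [
        f"DEVICE={device}",
        "BOOTPROTO=static",  # no DHCP; use the address given below
        "ONBOOT=yes",        # bring the interface up at boot
        f"IPADDR={ip}",
        f"NETMASK={netmask}",
    ]
    if gateway:
        lines.append(f"GATEWAY={gateway}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder addresses from the 192.0.2.0/24 documentation range.
    print(render_ifcfg("192.0.2.10", "255.255.255.0", gateway="192.0.2.1"), end="")
```

The output would be written to the ifcfg file inside the guest and the network service restarted there; whether any further change is needed on the FC4 xenU kernel mentioned above is not something this thread establishes.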