Greetings:

I'm experiencing some really strange behavior with an OpenSuse 10.3 guest running in Xen. Every 48-72 hours, the machine starts running at a very high load average, dumping tons of messages into the message log, and finally becoming completely inaccessible. When the guest finally becomes unusable, the host "xm top" display shows 399% CPU utilization and constant NET and VBD activity, but the host cannot even "shutdown" the guest - I have to destroy it to make it stop.

The host machine is a Dell Poweredge 2950 III server, running OpenSuse 11.1, 64 bit, kernel 2.6.27.45-0.1-xen, and Xen package xen-3.3.1_18546_24-0.4.13. It has 20GB of RAM, a quad-core 2GHz Intel CPU, and a Dell Perc5 RAID. It runs other guest machines with no problem.

The guest machine is running OpenSuse 10.3, kernel 2.6.22.19-0.4-xenpae, in 32 bit mode, with Xen package xen-3.1.0_15042-51.3.

The guest machine is a clone of a running physical machine that I'm trying to virtualize. I did the creation of the drive, the attach, and so forth, on the Xen host, then I did an rsync of the 10.3 physical machine's filesystems onto the 11.1 host. I removed and reinstalled the Xen kernel package as suggested on the net, and, against even my predictions, got the guest to boot. And it works great... for a few days or so.

But then the guest starts to go crazy. I see rapidly repeating messages like this start to appear in the syslog /var/log/messages:

Nov 20 15:35:55 guestc kernel: b_state=0x00000029, b_size=4096
Nov 20 15:35:55 guestc kernel: device blocksize: 4096
Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=210137505, b_blocknr=20676879

Occasionally these messages show up garbled, like this:

Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=21_f__f__f__
f___f__f_f_e_f_f____f_f_f_f_____f___f_f_____f__f__f_f___f__f__f_f__f__f____f__f_
f_f___f_f__f_____f__f__f__f__f_f_____f_f_f____f______f__f__f__f____f__f____f__f_
f__f___f__f___f__f__f__f_f_f__f__f____f__f____f__f___f___f__f_f___f__f__f_f_f__f
_f___f___f__f__f__f_f___f___f__f__f___f__f_e_f__f_f__f__f__f______f__f______f__f
__f__f_f___f_f___f_f_____f__f_f__f___f__f_f____f_f__f__f_f___f__f___f__f__f_f___
f__f_____f__f__f__f___state_f__f___f_f___f______f_fe___f___f_____f___f____f_____
f__f__f_f__f__f___f__f__f_____f______f__f____f_f___f_f_f____f___f__f___f____f__f
__f____f__f_____f___f_f_____f__f_____f__f__f_f_f________f___f___f_f__f__f__f__f_
f_f_____f_f_f__e_f__f___f__f__f__f_f_f___f___f___f__f__f__state=0x000000__f__f_s
tate=0x00000029, b_size=4096

And then, of course, I can't even get into the guest at all, via network or xm console. xm shutdown does nothing, and I must xm destroy the guest.

After re-creating the guest, everything runs fine again, until another few days have passed.

Today I was actually in the guest when this happened. An rsync was running, and that process was pegged, with the guest showing a load average of 5.0 from within the guest, and "xm top" showing a usage of 199% (2 of the 4 CPUs?). I couldn't kill the rsync process, and the messages above were flooding into the syslog. The guest could not shut all the way down even with "init 0", and, eventually, I had to destroy it again.

Here is the machine config:

name="guestc"
uuid="91919191-3676-3f68-bada-993e5adb1088"
memory=8192
maxmem=8192
vcpus=4
on_poweroff="destroy"
on_reboot="restart"
on_crash="destroy"
localtime=0
keymap="en-us"
builder="linux"
bootloader="/usr/lib/xen/boot/domUloader.py"
bootargs="--entry=xvda2:/boot/vmlinuz-xenpae,/boot/initrd-xenpae"
extra=" "
disk=[ 'file:/a/disks/guestc/disk0,xvda,w', 'phy:sdc1,sdc1,w', ]
vif=[ 'mac=00:16:3e:52:f9:96,bridge=br0', ]
vfb=['type=vnc,vncunused=1']

Now, I get that I'm doing some unorthodox things here. Cloning a physical machine into a virtual machine. Running 10.3 as a guest under an 11.1 host. A 32-bit guest on a 64-bit host. But the thing DOES run, and I feel like I'm SO CLOSE to making this work, so I'm really hopeful that someone can recognize these symptoms and help me find a solution, rather than just pointing out the obviously edge-case aspects of this situation.

Any ideas or guidance would be greatly appreciated!

Thank you!
Glen
Glen Barney

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
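When the messages flood in this quickly, it can help to reduce them to a rate and a set of affected blocks before reporting them. The following is only a rough sketch, assuming the syslog line layout quoted in the post above; the names FAILURE and summarize are made up for illustration, and garbled lines are deliberately skipped rather than guessed at.

```python
import re
from collections import Counter

# Well-formed failure line as quoted in the post; garbled variants will not match.
FAILURE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+kernel:\s+"
    r"__find_get_block_slow\(\) failed\.\s+"
    r"block=(?P<block>\d+),\s*b_blocknr=(?P<blocknr>\d+)"
)


def summarize(lines):
    """Return (failures per minute, set of distinct (block, b_blocknr) pairs)."""
    per_minute = Counter()
    pairs = set()
    for line in lines:
        match = FAILURE.match(line)
        if match:
            per_minute[match.group("minute")] += 1
            pairs.add((int(match.group("block")), int(match.group("blocknr"))))
    return per_minute, pairs


# Small in-line sample taken from the post; a real run would iterate over
# the lines of the guest's /var/log/messages instead.
SAMPLE_LINES = [
    "Nov 20 15:35:55 guestc kernel: b_state=0x00000029, b_size=4096",
    "Nov 20 15:35:55 guestc kernel: device blocksize: 4096",
    "Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. "
    "block=210137505, b_blocknr=20676879",
    "Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=21_f__f__f__",
]

per_minute, pairs = summarize(SAMPLE_LINES)
for minute, count in per_minute.items():
    print(minute, count)
print(len(pairs), "distinct (block, b_blocknr) pairs")
```

If the same few block numbers recur every time, that may point at one region of one disk; if they are spread widely, it may not. Either way the summary is easier to post to a list than the raw flood.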
Glen
2010-Nov-24 16:49 UTC
[Xen-devel] Begging for help with cloned OpenSuse 10.3 DomU on OpenSuse 11.1 Dom0
Dear Xen Developers:

Please forgive me in advance, but I've exhausted every other option, read every manual I can find, and got no response on the Xen-Users list, so before I give up completely may I please beg for any insight any of you might have into a strange problem I'm experiencing...

I'm experiencing some really strange behavior with an OpenSuse 10.3 guest running in Xen. Every 48-72 hours, the machine starts running at a very high load average, dumping tons of messages into the message log, and finally becoming completely inaccessible. When the guest finally becomes unusable, the host "xm top" display shows 399% CPU utilization and constant NET and VBD activity, but the host cannot even "shutdown" the guest - I have to destroy it to make it stop.

The host machine is a Dell Poweredge 2950 III server, running OpenSuse 11.1, 64 bit, kernel 2.6.27.45-0.1-xen, and Xen package xen-3.3.1_18546_24-0.4.13. It has 20GB of RAM, a quad-core 2GHz Intel CPU, and a Dell Perc5 RAID. It runs other guest machines with no problem.

The guest machine is running OpenSuse 10.3, kernel 2.6.22.19-0.4-xenpae, in 32 bit mode, with Xen package xen-3.1.0_15042-51.3.

The guest machine is a clone of a running physical machine that I'm trying to virtualize. I did the creation of the drive, the attach, and so forth, on the Xen host, then I did an rsync of the 10.3 physical machine's filesystems onto the 11.1 host. I removed and reinstalled the Xen kernel package as suggested on the net, and, against even my predictions, got the guest to boot. And it works great... for a few days or so.

But then the guest starts to go crazy. I see rapidly repeating messages like this start to appear in the syslog /var/log/messages:

Nov 20 15:35:55 guestc kernel: b_state=0x00000029, b_size=4096
Nov 20 15:35:55 guestc kernel: device blocksize: 4096
Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=210137505, b_blocknr=20676879

Occasionally these messages show up garbled, like this:

Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=21_f__f__f__
f___f__f_f_e_f_f____f_f_f_f_____f___f_f_____f__f__f_f___f__f__f_f__f__f____f__f_
f_f___f_f__f_____f__f__f__f__f_f_____f_f_f____f______f__f__f__f____f__f____f__f_
f__f___f__f___f__f__f__f_f_f__f__f____f__f____f__f___f___f__f_f___f__f__f_f_f__f
_f___f___f__f__f__f_f___f___f__f__f___f__f_e_f__f_f__f__f__f______f__f______f__f
__f__f_f___f_f___f_f_____f__f_f__f___f__f_f____f_f__f__f_f___f__f___f__f__f_f___
f__f_____f__f__f__f___state_f__f___f_f___f______f_fe___f___f_____f___f____f_____
f__f__f_f__f__f___f__f__f_____f______f__f____f_f___f_f_f____f___f__f___f____f__f
__f____f__f_____f___f_f_____f__f_____f__f__f_f_f________f___f___f_f__f__f__f__f_
f_f_____f_f_f__e_f__f___f__f__f__f_f_f___f___f___f__f__f__state=0x000000__f__f_s
tate=0x00000029, b_size=4096

And then, of course, I can't even get into the guest at all, via network or xm console. xm shutdown does nothing, and I must xm destroy the guest.

After re-creating the guest, everything runs fine again, until another few days have passed.

Today I was actually in the guest when this happened. An rsync was running, and that process was pegged, with the guest showing a load average of 5.0 from within the guest, and "xm top" showing a usage of 199% (2 of the 4 CPUs?). I couldn't kill the rsync process, and the messages above were flooding into the syslog. The guest could not shut all the way down even with "init 0", and, eventually, I had to destroy it again.

Here is the machine config:

name="guestc"
uuid="91919191-3676-3f68-bada-993e5adb1088"
memory=8192
maxmem=8192
vcpus=4
on_poweroff="destroy"
on_reboot="restart"
on_crash="destroy"
localtime=0
keymap="en-us"
builder="linux"
bootloader="/usr/lib/xen/boot/domUloader.py"
bootargs="--entry=xvda2:/boot/vmlinuz-xenpae,/boot/initrd-xenpae"
extra=" "
disk=[ 'file:/a/disks/guestc/disk0,xvda,w', 'phy:sdc1,sdc1,w', ]
vif=[ 'mac=00:16:3e:52:f9:96,bridge=br0', ]
vfb=['type=vnc,vncunused=1']

Now, I get that I'm doing some unorthodox things here. Cloning a physical machine into a virtual machine. Running 10.3 as a guest under an 11.1 host. A 32-bit guest on a 64-bit host. But the thing DOES run, and I feel like I'm SO CLOSE to making this work, so I'm really hopeful that someone can recognize these symptoms and help me find a solution.

Is there any way this can be made to work? Or am I totally out of luck? (Or just crazy to even try?)

Any ideas or guidance would be greatly appreciated!

Thank you!
Glen
Glen Barney

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
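Since the guest config above is, as far as I know, evaluated by xm as ordinary Python assignments, a few lines of Python are enough to pull the disk entries apart and see which backend each guest device maps to. This is a sketch under that assumption only; load_config and parse_disk are hypothetical helper names, not part of Xen, and the exec call should only ever be given a config file you trust.

```python
SAMPLE_CONFIG = """
name="guestc"
memory=8192
maxmem=8192
vcpus=4
disk=[ 'file:/a/disks/guestc/disk0,xvda,w', 'phy:sdc1,sdc1,w', ]
vif=[ 'mac=00:16:3e:52:f9:96,bridge=br0', ]
"""


def load_config(text):
    """Evaluate an xm-style config (plain Python assignments) into a dict."""
    settings = {}
    # Builtins are emptied as a small precaution; this is NOT a sandbox.
    exec(text, {"__builtins__": {}}, settings)
    return settings


def parse_disk(spec):
    """Split 'type:path,guestdev,mode' into (type, path, guestdev, mode)."""
    backend, guest_dev, mode = spec.split(",")
    kind, path = backend.split(":", 1)
    return kind, path, guest_dev, mode


config = load_config(SAMPLE_CONFIG)
print("vcpus:", config["vcpus"], "memory (MB):", config["memory"])
for spec in config["disk"]:
    kind, path, guest_dev, mode = parse_disk(spec)
    print(f"{guest_dev}: {kind} backend {path} (mode {mode})")
```

For the config in the post this lists one file-backed disk exported as xvda and one physical partition exported under the name sdc1, which is a convenient summary to include when asking for help.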
Konrad Rzeszutek Wilk
2010-Nov-24 17:04 UTC
Re: [Xen-devel] Begging for help with cloned OpenSuse 10.3 DomU on OpenSuse 11.1 Dom0
> Now, I get that I'm doing some unorthodox things here. Cloning a physical
> machine into a virtual machine. Running 10.3 as a guest under an 11.1 host.

What is the filesystem you have in your guest?
Glen
2010-Nov-24 17:15 UTC
Re: [Xen-devel] Begging for help with cloned OpenSuse 10.3 DomU on OpenSuse 11.1 Dom0
On 11/24/2010 9:04 AM, Konrad Rzeszutek Wilk wrote:
>> Now, I get that I'm doing some unorthodox things here. Cloning a physical
>> machine into a virtual machine. Running 10.3 as a guest under an 11.1 host.
> What is the filesystem you have in your guest?

Konrad -

Thank you so much for replying! Both filesystems are EXT3.

disk=[ 'file:/a/disks/ietfc103/disk0,xvda,w', 'phy:sdc1,sdc1,w', ]

EXT3 FS on xvda2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
EXT3 FS on sdc1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.

glen@guestc:~> mount
/dev/xvda2 on / type ext3 (rw,acl,user_xattr)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
debugfs on /sys/kernel/debug type debugfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/sdc1 on /a type ext3 (rw,acl,user_xattr)
securityfs on /sys/kernel/security type securityfs (rw)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)

Glen
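The answer to the filesystem question can also be read mechanically from the mount output. A minimal sketch, assuming the "DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)" layout shown in the reply above; parse_mounts is a hypothetical helper name and the sample is a subset of the posted output.

```python
import re

# One line of classic `mount` output: DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)
MOUNT_LINE = re.compile(
    r"^(?P<dev>\S+) on (?P<mnt>\S+) type (?P<fs>\S+) \((?P<opts>[^)]*)\)$"
)


def parse_mounts(text):
    """Return a list of (device, mountpoint, fstype, [options]) tuples."""
    rows = []
    for line in text.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match:
            rows.append((
                match.group("dev"),
                match.group("mnt"),
                match.group("fs"),
                match.group("opts").split(","),
            ))
    return rows


SAMPLE_MOUNT = """\
/dev/xvda2 on / type ext3 (rw,acl,user_xattr)
proc on /proc type proc (rw)
udev on /dev type tmpfs (rw)
/dev/sdc1 on /a type ext3 (rw,acl,user_xattr)
"""

# Keep only the ext3 filesystems, which are the ones named in the reply.
for dev, mnt, fs, opts in parse_mounts(SAMPLE_MOUNT):
    if fs == "ext3":
        print(dev, "mounted on", mnt, "with options", ",".join(opts))
```

Filtering on the filesystem type this way separates the two block-backed ext3 mounts from the virtual filesystems, which are unlikely to be involved in block-layer messages.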
Jan Beulich
2010-Nov-25 08:24 UTC
Re: [Xen-devel] Begging for help with cloned OpenSuse 10.3 DomU on OpenSuse 11.1 Dom0
>>> On 24.11.10 at 17:49, Glen <gb2@c5i.net> wrote:
> The host machine is a Dell Poweredge 2950 III server, running OpenSuse 11.1,
> 64 bit, kernel 2.6.27.45-0.1-xen, and Xen package
> xen-3.3.1_18546_24-0.4.13 .
> It has 20GB of RAM, a quad-core 2GHz Intel CPU, and a Dell Perc5 RAID. It
> runs other guest machines with no problem.
>
> The guest machine is running OpenSuse 10.3, kernel 2.6.22.19-0.4-xenpae, in
> 32 bit mode, with Xen package xen-3.1.0_15042-51.3.

Why are you running software this old in both host and guest? I'm pretty certain neither is considered supported anymore, and chances are your problem would be solved by using newer bits.

Jan
George Dunlap
2010-Nov-25 09:44 UTC
Re: [Xen-devel] Begging for help with cloned OpenSuse 10.3 DomU on OpenSuse 11.1 Dom0
Yes, the first thing you should do is upgrade to newer versions of Xen / kernel, if they're available through OpenSuse.

If this is the most recent version in OpenSuse, then reporting it there is probably the best thing to do.

If you've tried the OpenSuse support channels and haven't gotten any help there, you basically have two options:

* Move to a different supported infrastructure. Xen Cloud Platform should be a good choice -- it's designed from the ground up to be a virtualization appliance, it's based on the XenServer product, and it's actively supported by both a community and Citrix engineers. (There are probably other good options, I'm just not familiar enough with them to make a personal recommendation.)

* Build your own versions of Xen and Linux from the most recent stable releases. Xen-users should be able to help you with that; and if you find bugs in those, this list will be much more willing / able to help you track them down.

(This is assuming you want a fully open version. XenServer is a good product, and the free-as-in-beer version is very full-featured, and also has a decent support community.)

Good luck,
-George
Glen
2010-Nov-25 17:50 UTC
Re: [Xen-devel] Begging for help with cloned OpenSuse 10.3 DomU on OpenSuse 11.1 Dom0
Jan, George,

Thank you both for taking the time to reply to me!

On 11/25/2010 12:24 AM, Jan Beulich wrote:
> Why are you running software this old in both host and guest?

I could explain it, but I have no desire to torture you. :-) And if I did explain it, the heads of half this list's subscribers would explode. :-)

> I'm pretty certain neither is considered supported anymore, and
> chances are your problem would be solved by using newer bits.

Thank you. I will start pursuing that direction today.

On 11/25/2010 1:44 AM, George Dunlap wrote:
> Yes, the first thing you should do is upgrade to newer versions of Xen
> / kernel, if they're available through OpenSuse.
> If this is the most recent version in OpenSuse, then reporting it
> there is probably the best thing to do.

That sounds good, thanks. I will do so.

> If you've tried the OpenSuse support channels and haven't gotten any
> help there, you basically have two options:
> * Move to a different supported infrastructure. Xen Cloud Platform
> should be a good choice -- it's designed from the ground up to be a
> virtualization appliance, it's based on the XenServer product, and
> it's actively supported by both a community and Citrix engineers.
> (There are probably other good options, I'm just not familiar enough
> with them to make a personal recommendation.)

No, I value your recommendation immensely, and am very grateful! As soon as I solve my immediate problem you can be sure I will check that out!

> * Build your own versions of Xen and Linux from the most recent stable
> releases. Xen-users should be able to help you with that; and if you
> find bugs in those, this list will be much more willing / able to help
> you track them down.
> (This is assuming you want a fully open version. XenServer is a good
> product, and the free-as-in-beer version is very full-featured, and
> also has a decent support community.)
> Good luck,
> -George

Thank you - thank you both, very much. It will be even as you say. :-)

Warmest regards,
Glen
Glen Barney