Igor Galić
2012-Feb-21 22:46 UTC
[libvirt-users] libvirt doesn't boot kVM, hangs, qemu on 100% CPU
Hi folks,

it's been adventurous.
Yesterday night I've started debugging this particular issue of why my KVMs don't boot on Ubuntu 11.10.

A first hint was AppArmor, which seemed to deny access to the LVM partitions I had assigned as disks. After correctly configuring AppArmor, or even disabling it, the issue was still the same.

After a wild goose chase, I decided to update everything. I now have:

* libvirt 0.9.8 (built from Ubuntu 12.04 sources)
* qemu 1.0 (built from Ubuntu 12.04 sources)
* SeaBIOS (built from git master: pre-1.6.4-20120221_203841)

I even installed sgabios to see what's going on without having to connect with virt-manager every time.

So what's going on? What am I trying to do?

I'm trying to boot an Ubuntu amd64 VM, which looks pretty much the same as the Ubuntu host, except that I uninstalled libvirt and qemu-kvm, which felt out of place. This I'm trying to accomplish by booting it with the host's kernel and initrd (os/kernel and os/initrd).

So much for the starting conditions. You can find my configuration here: http://sprunge.us/gCVj (I'm attaching it just in case the pastebin goes stale).

The first thing that is slightly irritating is that it doesn't validate:

"""
Relax-NG validity error : Extra element kernel in interleave
web.xml:9: element kernel: Relax-NG validity error : Element os failed to validate content
web.xml:8: element type: Relax-NG validity error : Error validating value
web.xml:8: element type: Relax-NG validity error : Element type failed to validate content
web.xml:1: element domain: Relax-NG validity error : Invalid sequence in interleave
web.xml:1: element domain: Relax-NG validity error : Element domain failed to validate content
web.xml fails to validate
"""

Interestingly, even if I remove some of the elements that `virt-xml-validate` complains about via `virsh edit`, next time I `edit`, or `dumpxml` it'll be back in place.
With sgabios in place, I can fortunately just c/p to show you what a boot looks like:

"""
SeaBIOS (version pre-1.6.4-20120221_203841-bacon)

iPXE (http://ipxe.org) 00:03.0 C100 PCI2.10 PnP PMM+001C9B80+00189B80 C100
iPXE (http://ipxe.org) 00:04.0 C200 PCI2.10 PnP PMM 001C9B80 00189B80 C200

Booting from ROM...
"""

And that's it. And that's wrong. But I can't seem to figure out how and why.

I'll be hugely in debt for your help and welcome any suggestions, ready to provide as much information as you need to solve this issue.

So long,
i

--
Igor Galić
Tel: +43 (0) 664 886 22 883
Mail: i.galic at brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515 2EA5 4B1D 9E08 A097 C9AE

-------------- next part --------------
A non-text attachment was scrubbed...
Name: web.xml
Type: application/xml
Size: 2971 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvirt-users/attachments/20120221/acc4428f/attachment.wsdl>
Eric Blake
2012-Feb-21 23:33 UTC
[libvirt-users] libvirt doesn't boot kVM, hangs, qemu on 100% CPU
On 02/21/2012 03:46 PM, Igor Galić wrote:
> So much for the starting conditions. You can find my
> configuration here: http://sprunge.us/gCVj (I'm attaching
> it just in case the pastebin goes stale).
>
> The first thing that is slightly irritating is that it
> doesn't validate:
>
> """
> Relax-NG validity error : Extra element kernel in interleave
> web.xml:9: element kernel: Relax-NG validity error : Element os failed to validate content
> web.xml:8: element type: Relax-NG validity error : Error validating value
> web.xml:8: element type: Relax-NG validity error : Element type failed to validate content
> web.xml:1: element domain: Relax-NG validity error : Invalid sequence in interleave
> web.xml:1: element domain: Relax-NG validity error : Element domain failed to validate content
> web.xml fails to validate

Right now, the RNG says you can have either <kernel> or <boot> under <os>, but not both. But the C code doesn't reject attempts to have both, and I have to wonder if we are running into problems by allowing both. And unfortunately, none of the tests/qemuxml2argvdata test files include any use of <kernel>, so we aren't exercising this part of XML parsing.

Definitely some bugs to be fixed, but my problem is that I don't know what behavior should be legal. Is there ever a use case to combine both <kernel> and <boot> in the same image? Or are they really distinct (<kernel> says to boot using an image in the host, <boot> says to boot by using a kernel found in the guest device, whether that be a guest disk device or a guest network interface for PXE boot)? I'm hoping Dan has a bit more experience on this, given his work on libvirt-sandbox.

> """
>
> Interestingly, even if I remove some of the elements that
> `virt-xml-validate` complains about via `virsh edit`, next
> time I `edit`, or `dumpxml` it'll be back in place.

Hmm - that makes it sound like once we have both methods in memory, we don't really have a way to remove one again. Also quite fishy.
--
Eric Blake eblake at redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
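The schema rule described in the reply above (either <kernel> or <boot> under <os>, but not both) can be checked on a domain definition before handing it to libvirt. What follows is only a minimal sketch using Python's standard library, not anything libvirt itself ships. The function name `os_boot_methods` and the example XML (including the kernel and initrd paths) are hypothetical, since the attached web.xml is not reproduced in this thread.

```python
import xml.etree.ElementTree as ET


def os_boot_methods(domain_xml):
    """Return the set of boot methods ('kernel', 'boot') found under <os>."""
    root = ET.fromstring(domain_xml)
    os_elem = root.find("os")
    if os_elem is None:
        return set()
    # Only direct children of <os> matter for the schema rule.
    return {child.tag for child in os_elem if child.tag in ("kernel", "boot")}


# Hypothetical domain definition that mixes direct kernel boot with <boot>.
EXAMPLE = """
<domain type='kvm'>
  <name>web</name>
  <os>
    <type arch='x86_64'>hvm</type>
    <kernel>/boot/vmlinuz</kernel>
    <initrd>/boot/initrd.img</initrd>
    <boot dev='hd'/>
  </os>
</domain>
"""

if os_boot_methods(EXAMPLE) == {"kernel", "boot"}:
    print("both <kernel> and <boot> are present under <os>")
```

Whether combining the two should be legal is exactly the open question in the message above, so the sketch only reports the combination and does not reject it.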
Eric Blake
2012-Feb-22 23:42 UTC
[libvirt-users] RFC: <memory units=...> [was: libvirt doesn't boot kVM, hangs, qemu on 100% CPU]
On 02/21/2012 03:46 PM, Igor Galić wrote:
>
> Hi folks,
>
> it's been adventurous.
> Yesterday night I've started debugging this particular
> issue of why my KVMs don't boot on Ubuntu 11.10.

On IRC, we identified the culprit:

> <uuid>8cfcb7b0-10ee-7d08-9b64-9f39c154292a</uuid>
> <memory>2048</memory>
> <currentMemory>2048</currentMemory>

There ain't no way on earth you're going to boot a kernel in 2 megabytes of memory!

I propose enhancing the XML; on output, libvirt should produce:

<memory units='k'>2048</memory> => 2048 * kibibyte

the output unit must remain the same as it has always been, but the new attribute will make it easier for humans reading the XML to spot blunders like what spawned this thread.

On input, the optional attribute is more useful - we can use it to provide a multiplier (of course, the result will be rounded up to k, and again rounded up to any higher granularity per the hypervisor):

b => bytes
k, KiB => kibibyte (1024)
KB => kilobyte (1000)
M, MiB => mebibyte (1024*1024)
MB => megabyte (1000000)

and so on for at least G and T (do we need P, E, Z, or Y? and I'm jealous if you have a machine with 1Y memory).

Thoughts before I propose such a patch?

--
Eric Blake eblake at redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
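The input-side rule in the proposal above (apply a unit multiplier, then round up to whole KiB) can be illustrated with a short Python sketch. This is not libvirt's implementation. The helper name `memory_to_kib` is hypothetical, the G/T entries extend the quoted table by the same pattern, and two choices are assumptions: unit strings are treated as case-sensitive, and a missing attribute defaults to 'k'. Rounding to any coarser hypervisor granularity is left out.

```python
# Multipliers from the proposal's table; the G/T rows follow the same pattern.
UNIT_BYTES = {
    "b": 1,
    "k": 1024, "KiB": 1024,
    "KB": 1000,
    "M": 1024**2, "MiB": 1024**2,
    "MB": 1000**2,
    "G": 1024**3, "GiB": 1024**3,
    "GB": 1000**3,
    "T": 1024**4, "TiB": 1024**4,
    "TB": 1000**4,
}


def memory_to_kib(value, units="k"):
    """Convert a <memory> value with an optional unit to KiB, rounding up."""
    if units not in UNIT_BYTES:
        raise ValueError("unknown memory unit: %r" % units)
    if value < 0:
        raise ValueError("memory size must not be negative")
    total_bytes = value * UNIT_BYTES[units]
    # Ceiling division: a partial KiB is rounded up to a whole one.
    return -(-total_bytes // 1024)


print(memory_to_kib(2048))        # no unit given: already KiB
print(memory_to_kib(2, "GiB"))    # binary prefix
print(memory_to_kib(2, "MB"))     # decimal prefix, rounded up to whole KiB
```

With an explicit unit, the value that started this thread would read as `memory_to_kib(2048, "k")`, which makes the 2 MiB total much easier to spot.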
Igor Galić
2012-Feb-22 23:46 UTC
[libvirt-users] libvirt doesn't boot kVM, hangs, qemu on 100% CPU
Hey folks,

After two painful days of debugging, I've finally found the answer to the pains. It lies in my misinterpretation of the unit of <memory>: the unit is k, while I assumed m.

The first thing someone helpfully suggested on IRC, when seeing <memory>20480</memory>, was that 20G might be too much for my poor OS - so I shaved off a 0, going from 20M, which was already too little for booting a Linux kernel, down to 2.

Whelp.

I've already sent an RFE to seabios@ asking if they would perhaps very much mind adding a printout of the amount of RAM available, like any other BIOS does. Let's see where that goes.

Meanwhile I talked to Eric and convinced him that adding <memory units=...> would be a good thing. See his next mail in a different thread.

Well, so much for that. Thanks to everybody who helped me step through the stack. You've been a great crowd!

So long,
i

--
Igor Galić
Tel: +43 (0) 664 886 22 883
Mail: i.galic at brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515 2EA5 4B1D 9E08 A097 C9AE
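The arithmetic behind the mix-up is easy to check. A minimal Python sketch follows; the helper name `describe_kib` is hypothetical. It renders a unitless <memory> value, which libvirt reads as KiB, in a more readable unit.

```python
def describe_kib(kib):
    """Render a KiB count in MiB, or in GiB once it reaches 1024 MiB."""
    mib = kib / 1024
    if mib >= 1024:
        return "%d KiB = %g GiB" % (kib, mib / 1024)
    return "%d KiB = %g MiB" % (kib, mib)


# The two values from this thread, read as KiB (what libvirt does).
print(describe_kib(20480))         # 20480 KiB = 20 MiB
print(describe_kib(2048))          # 2048 KiB = 2 MiB

# The same 20480 read as MiB (what was assumed), expressed in KiB.
print(describe_kib(20480 * 1024))  # 20971520 KiB = 20 GiB
```

Read as MiB, 20480 looks like 20G, which is why dropping a zero seemed sensible; read as KiB, that edit took the guest from 20 MiB down to 2 MiB.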