Daniel P. Berrange
2007-Dec-19 05:38 UTC
[Xen-devel] PATCH 0/3: Direct linux kernel boot for HVM
Those who saw Chris Wright give my Xen summit presentation a few weeks back may remember one of the proposals was to support direct Linux kernel boot for HVM guests, i.e. the ability to have a guest boot off a kernel, initrd, and args instead of merely the BIOS-supported disk/cdrom/network.

This capability is useful for provisioning new guests because you can pass information directly to the installer program via the kernel boot args. For example, with Anaconda you can pass in the URL for a kickstart file, and thus get a 100% automated / unattended install without needing to build custom ISO images or set up PXE.

QEMU / KVM have always had this ability for HVM guests, so I figured it ought to be possible to make it work in Xen too, except the necessary code was #ifdef'd out in the ioemu copy of QEMU. After some poking it became clear why this was. The Linux kernel wants its protected-mode image to live at 0x100000 and starts executing at this address immediately when switching from real to protected mode. A Xen guest also starts executing at 0x100000, and the HVM guest firmware lives at this address. They obviously can't both live there. Hence direct kernel boot is currently not supported on Xen.

The answer is to move the Linux kernel elsewhere... more on that in later patches.

The following 3 patches implement all this. They have been tested against xen-unstable changeset 16606, on i386 only. I have not tested them on x86_64 yet, hence I am NOT requesting commit to xen-unstable yet. This is just a posting for code review...

Regards,
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Daniel P. Berrange
2007-Dec-19 05:41 UTC
Re: [Xen-devel] PATCH 1/3: XenD changes for HVM kernel boot
This patch provides the tools support for direct kernel boot of HVM guests.

Currently the config files in /etc/xen support the args 'kernel', 'ramdisk' and 'extra'. For PV guests these have the obvious meaning. Unfortunately HVM guest configs hijacked the 'kernel' parameter and use it to refer to the path of the HVM firmware. So, this patch adds a new config file parameter called 'loader' which is used to refer to the HVM firmware instead.

The conventions for loading the initrd image say that it should live at the end of memory. This requires QEMU to know the size of the guest's initial RAM allocation, so image.py is changed to pass the '-m' flag to QEMU.

The HVMImageHandler class in image.py is changed so that if the 'kernel', 'ramdisk' or 'extra' params were given in the config, these are passed to QEMU with the '-kernel', '-initrd' and '-append' flags respectively. Finally, the 'loader' param is used as the arg to 'xc_hvm_build' instead of the old 'kernel' param.

For the sake of compatibility with old HVM guest config files, if the config file has a 'kernel' param whose path matches that of the HVM firmware, then we automatically convert this 'kernel' param into the 'loader' param. This ensures existing HVM guests work without any changes.

For the purposes of testing, my guest looks like this:

  name = "hvmdemo"
  builder = "hvm"
  memory = "500"
  disk = [ "file:/var/lib/xen/images/hvmdemo.img,hda,w" ]
  uuid = "0a696059-d2e8-2691-86e7-1daeed939649"
  device_model = "/usr/lib/xen/bin/qemu-dm"
  loader = "/usr/lib/xen/boot/hvmloader"
  kernel = "/root/install/vmlinuz-f8-i386"
  ramdisk = "/root/install/initrd.img-f8-i386"
  extra = "console=ttyS0 console=tty0"
  serial = "file:/tmp/hvmdemo.log"
  vnc=1
  vncunused=1
  apic=0
  acpi=0
  pae=0

Note, here we demonstrate a useful advantage of direct kernel boot by telling the guest kernel to send its output to the first serial device, which we then connect to a file for logging.

 xend/XendConfig.py |   13 ++++++++++++-
 xend/image.py      |   24 +++++++++++++++++++-----
 xm/create.py       |    6 ++++++
 3 files changed, 37 insertions(+), 6 deletions(-)

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

Regards,
Dan.
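The config handling described in this message can be sketched roughly as follows. This is an illustrative sketch only, not the code from the patch: the helper names normalize_hvm_config and qemu_boot_args are hypothetical, and the firmware path is simply the one used in the example config, assumed here to be the path that a legacy 'kernel' entry is compared against.

```python
# Hypothetical sketch of the 'kernel' -> 'loader' compatibility rule and the
# QEMU flag mapping described above. Not the actual xend implementation.

# Assumption: firmware path as used in the example guest config.
HVMLOADER_PATH = "/usr/lib/xen/boot/hvmloader"


def normalize_hvm_config(cfg):
    """Return a copy of cfg where a legacy 'kernel' entry that points at the
    HVM firmware has been converted into a 'loader' entry."""
    out = dict(cfg)
    if "loader" not in out and out.get("kernel") == HVMLOADER_PATH:
        out["loader"] = out.pop("kernel")
    return out


def qemu_boot_args(cfg):
    """Build the device-model arguments for direct kernel boot.

    '-m' is always passed so QEMU knows the guest RAM size and can place the
    initrd at the end of memory."""
    args = ["-m", str(cfg["memory"])]
    if cfg.get("kernel"):
        args += ["-kernel", cfg["kernel"]]
    if cfg.get("ramdisk"):
        args += ["-initrd", cfg["ramdisk"]]
    if cfg.get("extra"):
        args += ["-append", cfg["extra"]]
    return args
```

With an old-style config whose 'kernel' is the firmware path, normalize_hvm_config moves that value to 'loader' and qemu_boot_args then emits only the '-m' flag, so the guest boots through the BIOS as before.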
Daniel P. Berrange
2007-Dec-19 05:42 UTC
Re: [Xen-devel] PATCH 2/3: Support booting relocatable kernels
This patch introduces the basic infrastructure for direct kernel boot in the ioemu copy of QEMU. The current #ifdef-disabled code is actually obsolete with respect to upstream QEMU code, so it is removed entirely. In its place I have imported the latest upstream QEMU code.

The QEMU code assumes that the guest RAM is directly mapped into the QEMU process, so some changes were necessary. Instead of strcpy/memcpy'ing the args and kernel header into guest RAM, cpu_physical_memory_rw is used. Instead of fread()'ing the initrd and kernel into guest RAM, a helper function, fread2guest, is used; it reads into a small buffer and then uses cpu_physical_memory_rw.

NB in reading the following, Documentation/i386/boot.txt is a useful reference for what's going on.

Next, instead of loading the kernel at 0x100000, this code loads it at 0x200000. This is far enough away that there's no risk of it overlapping with the HVM firmware image. If the Linux kernel boot protocol is 0x205 or later, and the flag at offset 0x234 in the kernel header is 1, then the guest kernel was built with CONFIG_RELOCATABLE=y.

In this scenario we merely need to tell the kernel what address it has been relocated to by writing 0x200000 into the kernel header at offset 0x214. When switching from real mode into protected mode the kernel will immediately start executing at 0x200000 and be happy with life. This should work for 2.6.20 or later on i386, and 2.6.22 or later on x86_64.

This has been verified with Fedora 7 and Fedora 8 bare metal kernels on i386 from the $TREE/images/pxeboot of the install trees.

NB x86_64 is not yet tested.

 pc.c | 352 ++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 265 insertions(+), 87 deletions(-)

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

Regards,
Dan.
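The relocatable-kernel check and header patch described in this message can be illustrated with a short sketch. It is not the pc.c code from the patch (which is C); it operates on an in-memory copy of the kernel setup header, and the function names are hypothetical. Offsets 0x214 and 0x234 come from the message; the "HdrS" signature at 0x202 and the 16-bit version field at 0x206 are taken from the Linux boot protocol document it refers to. The sketch treats any non-zero flag as relocatable, which is how the boot protocol defines the field.

```python
import struct

LOAD_ADDR = 0x200000        # load address chosen in the patch description
OFF_MAGIC = 0x202           # "HdrS" signature (from the boot protocol doc)
OFF_VERSION = 0x206         # boot protocol version, 16-bit little endian
OFF_CODE32_START = 0x214    # 32-bit protected-mode entry point
OFF_RELOCATABLE = 0x234     # non-zero if built with CONFIG_RELOCATABLE=y


def is_relocatable(header):
    """True if the header advertises protocol >= 0x205 and the
    relocatable flag is set."""
    if bytes(header[OFF_MAGIC:OFF_MAGIC + 4]) != b"HdrS":
        return False
    (version,) = struct.unpack_from("<H", header, OFF_VERSION)
    return version >= 0x205 and header[OFF_RELOCATABLE] != 0


def patch_entry_point(header, load_addr=LOAD_ADDR):
    """Tell the kernel where its protected-mode image was loaded by
    overwriting code32_start in place (header must be a bytearray)."""
    struct.pack_into("<I", header, OFF_CODE32_START, load_addr)
```

A caller would read the start of the kernel image into a bytearray, call is_relocatable on it, and, if that returns True, call patch_entry_point before copying the header into guest memory.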
Daniel P. Berrange
2007-Dec-19 05:43 UTC
Re: [Xen-devel] PATCH 3/3: Support boot of NON-relocatable kernels
This patch introduces a hack to make non-relocatable kernels bootable too. Non-relocatable kernels absolutely want to run at 0x100000 and are not at all happy about being at 0x200000. Fortunately, thanks to crazy programs like LOADLIN, Linux has a couple of hooks in its boot process which can be used to play games. The 'code32_start' hook is executed immediately following the switch to protected mode. To quote the kernel docs:

[quote Documentation/i386/boot.txt]
  code32_start:
        A 32-bit flat-mode routine *jumped* to immediately after the
        transition to protected mode, but before the kernel is
        uncompressed.  No segments, except CS, are set up; you should
        set them up to KERNEL_DS (0x18) yourself.

        After completing your hook, you should jump to the address
        that was in this field before your boot loader overwrote it.

  IMPORTANT: All the hooks are required to preserve %esp, %ebp, %esi and
  %edi across invocation.
[/quote]

So, this patch installs a hook at 0x200000+kernel_size. The hook is hand-crafted assembly which sets up all the segments as needed, then essentially does memmove(0x100000,0x200000,kernel_size) and finally does an unconditional jmp to 0x100000.

Amazingly, this really does work. It has been successfully tested with RHEL-2.1 and Fedora Core 6 install kernels on i386.

NB x86_64 is not yet tested.

 pc.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 93 insertions(+), 2 deletions(-)

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

Regards,
Dan.
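To illustrate why the relocation step in the hook is safe even when the kernel image is larger than the 1 MB gap between the two addresses, here is a small simulation. It is only a model of the memmove step, using a bytearray in place of guest memory; the real hook is hand-written assembly, and the function names here are hypothetical. Because the destination lies below the source, a simple forward copy never overwrites bytes it has not yet read, and the hook itself (placed after the loaded kernel) lies above everything the copy writes.

```python
KERNEL_WANTED = 0x100000   # where a non-relocatable kernel expects to run
KERNEL_LOADED = 0x200000   # where the patch series loads it


def hook_address(kernel_size):
    """Address at which the hook is installed: right after the loaded kernel."""
    return KERNEL_LOADED + kernel_size


def relocate(mem, kernel_size):
    """Model of the hook's copy: forward, byte by byte, from the load address
    down to 0x100000. Safe for overlapping ranges because dest < src.
    Returns the address the hook would finally jump to."""
    for i in range(kernel_size):
        mem[KERNEL_WANTED + i] = mem[KERNEL_LOADED + i]
    return KERNEL_WANTED
```

With a 1.5 MB kernel, for example, the source range 0x200000-0x380000 overlaps the destination range 0x100000-0x280000, yet the copy still reproduces the image exactly, and the hook at 0x380000 is untouched.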
Daniel P. Berrange
2007-Dec-19 05:44 UTC
Re: [Xen-devel] PATCH 2/3: Support booting relocatable kernels
And with the patch attached this time....
Keir Fraser
2007-Dec-19 10:48 UTC
Re: [Xen-devel] PATCH 0/3: Direct linux kernel boot for HVM
On 19/12/07 05:38, "Daniel P. Berrange" <berrange@redhat.com> wrote:

> Those who saw Chris Wright give my Xen summit presentation a few weeks back
> may remember one of the proposals was to support direct linux kernel boot
> for HVM guests. ie, the ability to have a guest boot off a kernel, initrd,
> and args instead of merely the BIOS supported disk/cdrom/network.

Is the aim to be able to boot from any device that can be plumbed to qemu, or to allow kernel/initrd parms to be specified from outside the HVM bootloader config file? If the former, Anthony Liguori's recent patch to implement an option ROM to allow easy cut-thru to any qemu backend device looks like an easier way to achieve that aim.

 -- Keir
Daniel P. Berrange
2007-Dec-19 13:00 UTC
Re: [Xen-devel] PATCH 0/3: Direct linux kernel boot for HVM
On Wed, Dec 19, 2007 at 10:48:13AM +0000, Keir Fraser wrote:
> On 19/12/07 05:38, "Daniel P. Berrange" <berrange@redhat.com> wrote:
>
> > Those who saw Chris Wright give my Xen summit presentation a few weeks back
> > may remember one of the proposals was to support direct linux kernel boot
> > for HVM guests. ie, the ability to have a guest boot off a kernel, initrd,
> > and args instead of merely the BIOS supported disk/cdrom/network.
>
> Is the aim to be able to boot from any device that can be plumbed to qemu,
> or to allow kernel/initrd parms to be specified from outside the HVM
> bootloader config file? If the former, Anthony Liguori's recent patch to
> implement an option ROM to allow easy cut-thru to any qemu backend device
> looks like an easier way to achieve that aim.

It is the latter - just to allow boot from external kernel/initrd/params. This is basically making

  qemu-dm -kernel /path/to/vmlinux -initrd /path/initrd -append 'someargs'

work in Xen the same way it works in current upstream QEMU.

Currently this hooks into QEMU's boot by writing a fake MBR into the first hard disk. Once Anthony's option ROM patches are accepted upstream, they could be leveraged to get rid of the MBR hook (indeed Anthony's patches include this optimization). This is only a very minor part of the code though, so the bulk of this patch would still remain.

Regards,
Dan.
Keir Fraser
2007-Dec-19 14:43 UTC
Re: [Xen-devel] PATCH 0/3: Direct linux kernel boot for HVM
On 19/12/07 13:00, "Daniel P. Berrange" <berrange@redhat.com> wrote:

> It is the latter - just to allow boot from external kernel/initrd/params.
> This is basically making
>
>   qemu-dm -kernel /path/to/vmlinux -initrd /path/initrd -append 'someargs'
>
> work in Xen the same way it works in current upstream QEMU.
>
> Currently this hooks in QEMU's boot by writing a fake MBR into the first
> harddisk. Once anthony's option ROM patch is accepted upstream they could
> be leveraged to get rid of the MBR hook (indeed anthony's patches include
> this optimization). This is only a very minor part of the code though, so
> the bulk of this patch would still remain.

The patches look fine to me in principle, especially if a lot of the code is in upstream qemu anyway. I'll do a deeper dive after 3.2.0.

 -- Keir