Jeremy Fitzhardinge
2007-Jun-20 16:55 UTC
[PATCH 5/9] add WEAK() for creating weak asm labels
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> --- include/linux/linkage.h | 6 ++++++ 1 file changed, 6 insertions(+) ==================================================================--- a/include/linux/linkage.h +++ b/include/linux/linkage.h @@ -34,6 +34,12 @@ name: #endif +#ifndef WEAK +#define WEAK(name) \ + .weak name; \ + name: +#endif + #define KPROBE_ENTRY(name) \ .pushsection .kprobes.text, "ax"; \ ENTRY(name) --
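For context: a later patch in this series uses WEAK(lguest_entry) and WEAK(xen_entry) in head.S so that a kernel built without those subarchitectures still links; a weak label is simply a default that the linker replaces whenever a strong definition of the same symbol is present. A minimal userspace sketch of the same linker behaviour using GCC's __attribute__((weak)) -- illustration only, not part of the patch:

/*
 * C-level analogue of a weak asm label: the weak definition is the
 * fallback, and linking in any object with a strong xen_entry()
 * silently overrides it -- no Makefile or #ifdef changes needed.
 */
#include <stdio.h>

/* Default (weak) implementation; a strong xen_entry() elsewhere wins. */
void __attribute__((weak)) xen_entry(void)
{
	printf("no Xen support built in\n");
}

int main(void)
{
	xen_entry();
	return 0;
}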
Jeremy Fitzhardinge
2007-Jun-20 16:55 UTC
[PATCH 2/9] define ELF notes for adding to a boot image
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> --- include/linux/elf_boot.h | 15 +++++++++++++++ 1 file changed, 15 insertions(+) ==================================================================--- /dev/null +++ b/include/linux/elf_boot.h @@ -0,0 +1,15 @@ +#ifndef ELF_BOOT_H +#define ELF_BOOT_H + +/* Elf notes to help bootloaders identify what program they are booting. + */ + +/* Standardized Elf image notes for booting... The name for all of these is ELFBoot */ +#define ELF_NOTE_BOOT "ELFBoot" + +#define EIN_PROGRAM_NAME 1 /* The program in this ELF file */ +#define EIN_PROGRAM_VERSION 2 /* The version of the program in this ELF file */ +#define EIN_PROGRAM_CHECKSUM 3 /* ip style checksum of the memory image. */ +#define EIN_ARGUMENT_STYLE 4 /* String identifying argument passing style */ + +#endif /* ELF_BOOT_H */ --
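These notes end up in a PT_NOTE segment of the boot image, so a bootloader identifies what it is loading by walking the standard ELF note records (Elf32_Nhdr: namesz, descsz, type, followed by the 4-byte-padded name and descriptor) and matching the "ELFBoot" name. A sketch of that scan, assuming a 32-bit image whose PT_NOTE contents have already been read into memory; find_boot_note() is a made-up name, not bootloader code from this series:

/*
 * Scan a PT_NOTE segment for one of the ELF_NOTE_BOOT notes defined
 * above and return a pointer to its descriptor (e.g. the program name).
 */
#include <elf.h>
#include <string.h>

#define ELF_NOTE_BOOT "ELFBoot"

/* Note name and descriptor fields are padded to 4-byte boundaries. */
#define NOTE_ALIGN(x) (((x) + 3) & ~3)

static const void *find_boot_note(const void *notes, unsigned long len,
				  unsigned int type, unsigned int *desclen)
{
	const char *p = notes, *end = p + len;

	while (p + sizeof(Elf32_Nhdr) <= end) {
		const Elf32_Nhdr *n = (const Elf32_Nhdr *)p;
		const char *name = p + sizeof(*n);
		const char *desc = name + NOTE_ALIGN(n->n_namesz);

		if (n->n_namesz == sizeof(ELF_NOTE_BOOT) &&
		    memcmp(name, ELF_NOTE_BOOT, sizeof(ELF_NOTE_BOOT)) == 0 &&
		    n->n_type == type) {
			*desclen = n->n_descsz;
			return desc;
		}
		p = desc + NOTE_ALIGN(n->n_descsz);
	}
	return NULL;
}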
This patch makes .note segments always allocated; that is, they are loaded as part of the binary and appear in the :data segment. This is not always necessary, but certain users - such as vsyscalls and notes in boot images - require the notes to be allocated. Rather than having two ways of creating notes, just have one which suits everyone. The only downside is that the notes will actually consume space at runtime. This isn't a big deal, since a typical kernel doesn't have very many, if any. Also, make the ELFNOTE() macro do the right thing in 32/64 bit environments. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> --- arch/i386/kernel/vmlinux.lds.S | 9 ++++----- arch/i386/kernel/vsyscall-note.S | 4 +--- include/asm-generic/vmlinux.lds.h | 7 ++++++- include/linux/elfnote.h | 2 +- 4 files changed, 12 insertions(+), 10 deletions(-) ==================================================================--- a/arch/i386/kernel/vmlinux.lds.S +++ b/arch/i386/kernel/vmlinux.lds.S @@ -28,10 +28,9 @@ jiffies = jiffies_64; jiffies = jiffies_64; PHDRS { - text PT_LOAD FLAGS(5); /* R_E */ - data PT_LOAD FLAGS(7); /* RWE */ - note PT_NOTE FLAGS(0); /* ___ */ + STD_PHDRS } + SECTIONS { . = LOAD_OFFSET + LOAD_PHYSICAL_ADDR; @@ -72,6 +71,8 @@ SECTIONS _sdata = .; /* End of text section */ RODATA + + NOTES /* writeable */ . = ALIGN(4096); @@ -211,6 +212,4 @@ SECTIONS STABS_DEBUG DWARF_DEBUG - - NOTES } ==================================================================--- a/arch/i386/kernel/vsyscall-note.S +++ b/arch/i386/kernel/vsyscall-note.S @@ -9,9 +9,7 @@ /* Ideally this would use UTS_NAME, but using a quoted string here doesn't work. Remember to change this when changing the kernel's name. */ -ELFNOTE_START(Linux, 0, "a") - .long LINUX_VERSION_CODE -ELFNOTE_END +ELFNOTE(Linux, 0, .long LINUX_VERSION_CODE) #ifdef CONFIG_XEN ==================================================================--- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -245,8 +245,13 @@ __stop___bug_table = .; \ } +#define STD_PHDRS \ + text PT_LOAD FILEHDR PHDRS FLAGS(5); /* R_E */ \ + data PT_LOAD FLAGS(7); /* RWE */ \ + note PT_NOTE FLAGS(0); /* ___ */ + #define NOTES \ - .notes : { *(.note.*) } :note + .notes : { *(.note.*) } :data :note #define INITCALLS \ *(.initcall0.init) \ ==================================================================--- a/include/linux/elfnote.h +++ b/include/linux/elfnote.h @@ -53,7 +53,7 @@ 4484:.balign 4 ; \ .popsection ; #define ELFNOTE(name, type, desc) \ - ELFNOTE_START(name, type, "") \ + ELFNOTE_START(name, type, "a") \ desc ; \ ELFNOTE_END --
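For reference, this is the record that ELFNOTE(Linux, 0, .long LINUX_VERSION_CODE) lays down in .note.Linux: a standard note header followed by the padded name and descriptor, now in an "a" (allocated) section so it lands in a loaded segment. The struct below is only an illustration of that layout; the name and comments are mine, not the kernel's:

/*
 * On-disk/in-memory shape of the vsyscall version note emitted by the
 * ELFNOTE() macro (name and descriptor are each padded to 4 bytes).
 */
struct linux_version_note {
	unsigned int namesz;	/* = 6: strlen("Linux") + NUL */
	unsigned int descsz;	/* = 4: one .long */
	unsigned int type;	/* = 0 */
	char         name[8];	/* "Linux\0" padded to a 4-byte boundary */
	unsigned int desc;	/* LINUX_VERSION_CODE */
};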
Jeremy Fitzhardinge
2007-Jun-20 16:56 UTC
[PATCH 4/9] i386: make the bzImage payload an ELF file
This patch makes the payload of the bzImage file an ELF file. In other words, the bzImage is structured as follows: - boot sector - 16bit setup code - ELF header - decompressor - compressed kernel A bootloader may find the start of the ELF file by looking at the setup_size entry in the boot params, and using that to find the offset of the ELF header. The ELF Phdrs contain all the mapped memory required to decompress and start booting the kernel. One slightly complex part of this is that the bzImage boot_params need to know about the internal structure of the ELF file, at least to the extent of being able to point the core32_start entry at the ELF file's entrypoint, so that loaders which use this field will still work. Similarly, the ELF header needs to know how big the kernel vmlinux's bss segment is, in order to make sure is is mapped properly. To handle these two cases, we generate abstracted versions of the object files which only contain the symbols we care about (generated with objcopy --strip-all --keep-symbol=X), and then include those symbol tables with ld -R. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> --- arch/i386/boot/Makefile | 11 +++-- arch/i386/boot/compressed/Makefile | 18 ++++++-- arch/i386/boot/compressed/elfhdr.c | 65 +++++++++++++++++++++++++++++++ arch/i386/boot/compressed/head.S | 9 ++-- arch/i386/boot/compressed/notes.c | 7 +++ arch/i386/boot/compressed/piggy.S | 5 +- arch/i386/boot/compressed/vmlinux.lds | 29 +++++++++++-- arch/i386/boot/header.S | 14 ++---- arch/i386/boot/setup.ld | 5 +- arch/i386/kernel/asm-offsets.c | 2 arch/i386/kernel/head.S | 26 ++++++++++-- arch/i386/kernel/vmlinux.lds.S | 4 + arch/x86_64/boot/compressed/Makefile | 22 ++++++++-- arch/x86_64/boot/compressed/elfhdr.c | 1 arch/x86_64/boot/compressed/head.S | 12 ++--- arch/x86_64/boot/compressed/vmlinux.lds | 21 +++++++++- arch/x86_64/kernel/vmlinux.lds.S | 3 + include/asm-x86_64/elf-defines.h | 5 ++ include/asm-x86_64/elf.h | 2 scripts/Makefile.lib | 11 +++++ 20 files changed, 226 insertions(+), 46 deletions(-) ==================================================================--- a/arch/i386/boot/Makefile +++ b/arch/i386/boot/Makefile @@ -72,14 +72,19 @@ AFLAGS := $(CFLAGS) -D__ASSEMBLY__ SETUP_OBJS = $(addprefix $(obj)/,$(setup-y)) -LDFLAGS_setup.elf := -T -$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE +$(obj)/zImage $(obj)/bzImage: \ + LDFLAGS := \ + -R $(obj)/compressed/blob-syms \ + --defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T + +$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) \ + $(obj)/compressed/blob-syms FORCE $(call if_changed,ld) $(obj)/payload.o: EXTRA_AFLAGS := -Wa,-I$(obj) $(obj)/payload.o: $(src)/payload.S $(obj)/blob.bin -$(obj)/compressed/blob: FORCE +$(obj)/compressed/blob $(obj)/compressed/blob-syms: FORCE $(Q)$(MAKE) $(build)=$(obj)/compressed IMAGE_OFFSET=$(IMAGE_OFFSET) $@ # Set this if you want to pass append arguments to the zdisk/fdimage/isoimage kernel ==================================================================--- a/arch/i386/boot/compressed/Makefile +++ b/arch/i386/boot/compressed/Makefile @@ -4,21 +4,31 @@ # create a compressed vmlinux image from the original vmlinux # -targets := blob vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \ +targets := blob vmlinux.bin vmlinux.bin.gz \ + elfhdr.o head.o misc.o notes.o piggy.o \ 
vmlinux.bin.all vmlinux.relocs -LDFLAGS_blob := -T hostprogs-y := relocs CFLAGS := -m32 -D__KERNEL__ $(LINUX_INCLUDE) -O2 \ -fno-strict-aliasing -fPIC \ $(call cc-option,-ffreestanding) \ $(call cc-option,-fno-stack-protector) -LDFLAGS := -m elf_i386 +LDFLAGS := -R $(obj)/vmlinux-syms --defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T -$(obj)/blob: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE +OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o piggy.o) + +$(obj)/blob: $(src)/vmlinux.lds $(obj)/vmlinux-syms $(OBJS) FORCE $(call if_changed,ld) @: + +EXTRACTSYMS_blob-syms := blob_entry_32 +$(obj)/blob-syms: $(obj)/blob FORCE + $(call if_changed,symextract) + +EXTRACTSYMS_vmlinux-syms := __kernel_end __kernel_data_size +$(obj)/vmlinux-syms: vmlinux FORCE + $(call if_changed,symextract) $(obj)/vmlinux.bin: vmlinux FORCE $(call if_changed,objcopy) ==================================================================--- /dev/null +++ b/arch/i386/boot/compressed/elfhdr.c @@ -0,0 +1,65 @@ +#include <linux/elf-defn.h> +#include <asm/boot.h> +#include <asm/page.h> +#include <asm/elf-defines.h> + +typedef char ld_sym_t[]; + +extern ld_sym_t blob_filesz, blob_memsz, + blob_notesz, _notes, blob_entry_32, blob_entry_64; + +#define LDSYM(x) ((unsigned long)x) + +#define PHDR(type, offset, vaddr, paddr, filesz, memsz, flags, align) \ + { \ + .p_type = type, .p_offset = offset, \ + .p_vaddr = vaddr, .p_paddr = paddr, \ + .p_filesz = filesz, .p_memsz = memsz, \ + .p_flags = flags, .p_align = align, \ + } + +static Elf_Phdr phdr[] +__attribute__((section(".elf.phdr"), used)) = { + PHDR(PT_LOAD, 0, LOAD_PHYSICAL_ADDR, LOAD_PHYSICAL_ADDR, + LDSYM(blob_filesz), LDSYM(blob_memsz), + PF_R | PF_W | PF_X, + PAGE_SIZE), + PHDR(PT_NOTE, LDSYM(_notes), 0, 0, LDSYM(blob_notesz), 0, 0, 0), +}; + +static Elf_Ehdr ehdr +__attribute__((section(".elf.ehdr"), used)) = { + .e_ident = { [EI_MAG0] = ELFMAG0, + [EI_MAG1] = ELFMAG1, + [EI_MAG2] = ELFMAG2, + [EI_MAG3] = ELFMAG3, + + [EI_CLASS] = ELF_CLASS, + [EI_DATA] = ELF_DATA, + [EI_VERSION] = EV_CURRENT, + [EI_OSABI] = ELFOSABI_STANDALONE, + }, +#ifdef CONFIG_RELOCATABLE + .e_type = ET_DYN, +#else + .e_type = ET_EXEC, +#endif + .e_machine = ELF_ARCH, + .e_version = 1, +#if ELF_CLASS == ELFCLASS32 + .e_entry = LDSYM(blob_entry_32), +#elif ELF_CLASS == ELFCLASS64 + .e_entry = LDSYM(blob_entry_64), +#else +#warning ELF_CLASS not set +#endif + .e_phoff = (unsigned long)phdr, + .e_shoff = 0, + .e_flags = 0, + .e_ehsize = sizeof(Elf_Ehdr), + .e_phentsize = sizeof(Elf_Phdr), + .e_phnum = sizeof(phdr)/sizeof(*phdr), + .e_shentsize = sizeof(Elf_Shdr), + .e_shnum = 0, + .e_shstrndx = 0, +}; ==================================================================--- a/arch/i386/boot/compressed/head.S +++ b/arch/i386/boot/compressed/head.S @@ -27,11 +27,12 @@ #include <asm/segment.h> #include <asm/page.h> #include <asm/boot.h> +#include <asm/asm-offsets.h> .section ".text.head","ax",@progbits - .globl startup_32 + .globl blob_startup_32 -startup_32: +blob_startup_32: cld cli movl $(__BOOT_DS),%eax @@ -48,7 +49,7 @@ startup_32: * data at 0x1e4 (defined as a scratch field) are used as the stack * for this calculation. Only 4 bytes are needed. 
*/ - leal (0x1e4+4)(%esi), %esp + leal (BP_scratch+4)(%esi), %esp call 1f 1: popl %ebp subl $1b, %ebp @@ -85,7 +86,7 @@ 1: popl %ebp pushl %esi leal _end(%ebp), %esi leal _end(%ebx), %edi - movl $(_end - startup_32), %ecx + movl $(_end - blob_startup_32), %ecx std rep movsb ==================================================================--- /dev/null +++ b/arch/i386/boot/compressed/notes.c @@ -0,0 +1,7 @@ +#include <linux/elfnote.h> +#include <linux/elf_boot.h> +#include <linux/utsrelease.h> + +ELFNOTE(ELF_NOTE_BOOT, EIN_PROGRAM_NAME, "Linux"); +ELFNOTE(ELF_NOTE_BOOT, EIN_PROGRAM_VERSION, UTS_RELEASE); +ELFNOTE(ELF_NOTE_BOOT, EIN_ARGUMENT_STYLE, "Linux"); ==================================================================--- a/arch/i386/boot/compressed/piggy.S +++ b/arch/i386/boot/compressed/piggy.S @@ -1,8 +1,9 @@ .section .data.compressed,"a",@progbits -.globl input_data, input_len, output_len +.globl input_data, input_len, input_size, output_len -input_len: .long input_data_end - input_data +input_size = input_data_end - input_data +input_len: .long input_size input_data: .incbin "vmlinux.bin.gz" ==================================================================--- a/arch/i386/boot/compressed/vmlinux.lds +++ b/arch/i386/boot/compressed/vmlinux.lds @@ -1,18 +1,27 @@ OUTPUT_FORMAT("elf32-i386") OUTPUT_FORMAT("elf32-i386") OUTPUT_ARCH(i386) -ENTRY(startup_32) + SECTIONS { - /* Be careful parts of head.S assume startup_32 is at - * address 0. - */ + /* make sure we don't get anything from vmlinux-syms */ + /DISCARD/ : { */vmlinux-syms(*) } + . = 0 ; .text.head : { + *(.elf.ehdr) + *(.elf.phdr) + _notes = . ; + *(.note*) + _notes_end = .; + _head = . ; + blob_entry_32 = blob_startup_32 + IMAGE_OFFSET; *(.text.head) _ehead = . ; } + blob_notesz = _notes_end - _notes; .data.compressed : { + blob_payload = input_data + IMAGE_OFFSET; *(.data.compressed) } .text : { @@ -33,6 +42,8 @@ SECTIONS *(.data.*) _edata = . ; } + + blob_filesz = . ; .bss : { _bss = . ; *(.bss) @@ -41,4 +52,14 @@ SECTIONS . = ALIGN(8); _end = . ; } + + /* How much memory we need for decompression: */ + blob_needs = . - IMAGE_OFFSET + /* compressed data + decompressor */ + __kernel_data_size + /* uncompressed data */ + (__kernel_data_size / 0x8000 * 8) + 0x8000 + 18; /* overhead */ + + /* Memory we need to reserve in PHDR: + max of our needs and kernel's needs */ + blob_memsz = blob_needs > __kernel_end ? blob_needs : __kernel_end; + } ==================================================================--- a/arch/i386/boot/header.S +++ b/arch/i386/boot/header.S @@ -148,20 +148,16 @@ CAN_USE_HEAP = 0x80 # If set, the load .byte LOADED_HIGH #endif -setup_move_size: .word _setup_size # size to move, when setup is not +setup_move_size: .word 0x8000 # size to move, when setup is not # loaded at 0x90000. We will move setup # to 0x90000 then just before jumping # into the kernel. However, only the # loader knows how much data behind - # us also needs to be loaded. - -code32_start: # here loaders can put a different + # us also needs to be loaded. Needs to + # default to 0x8000 for old bootloaders. + +code32_start: .long blob_entry_32 # here loaders can put a different # start address for 32-bit code. 
-#ifndef __BIG_KERNEL__ - .long 0x1000 # 0x1000 = default for zImage -#else - .long 0x100000 # 0x100000 = default for big kernel -#endif ramdisk_image: .long 0 # address of loaded ramdisk image # Here the loader puts the 32-bit ==================================================================--- a/arch/i386/boot/setup.ld +++ b/arch/i386/boot/setup.ld @@ -9,6 +9,9 @@ ENTRY(_start) SECTIONS { + /* make sure we don't get anything from blob-syms */ + /DISCARD/ : { */blob-syms(*) } + .bstext 0 : { *(.bstext) } .bsdata : { *(.bsdata) } @@ -45,7 +48,7 @@ SECTIONS /DISCARD/ : { *(.note*) } - . = ALIGN(512); /* align to sector size */ + . = ALIGN(4096); /* align to page size */ _setup_size = . - _start; _setup_sects = _setup_size / 512; ==================================================================--- a/arch/i386/kernel/asm-offsets.c +++ b/arch/i386/kernel/asm-offsets.c @@ -109,6 +109,8 @@ void foo(void) DEFINE(PTRS_PER_PTE, PTRS_PER_PTE); DEFINE(PTRS_PER_PMD, PTRS_PER_PMD); DEFINE(PTRS_PER_PGD, PTRS_PER_PGD); + DEFINE(PMD_SIZE, PMD_SIZE); + DEFINE(PGDIR_SIZE, PGDIR_SIZE); DEFINE(VDSO_PRELINK_asm, VDSO_PRELINK); ==================================================================--- a/arch/i386/kernel/head.S +++ b/arch/i386/kernel/head.S @@ -49,17 +49,33 @@ * * This should be a multiple of a page. */ -LOW_PAGES = 1<<(32-PAGE_SHIFT_asm) - +LOW_PAGES = ((-__PAGE_OFFSET) >> PAGE_SHIFT_asm) + + +#define ROUNDUP(x,y) (((x) + (y) - 1) & ~((y)-1)) + +/* number of pages needed to map x bytes */ #if PTRS_PER_PMD > 1 -PAGE_TABLE_SIZE = (LOW_PAGES / PTRS_PER_PMD) + PTRS_PER_PGD +#define MAPPING_SIZE(x) (ROUNDUP(x, PMD_SIZE) / PTRS_PER_PMD + PTRS_PER_PGD) #else -PAGE_TABLE_SIZE = (LOW_PAGES / PTRS_PER_PGD) -#endif +#define MAPPING_SIZE(x) (ROUNDUP(x, PGDIR_SIZE) / PTRS_PER_PGD) +#endif + +PAGE_TABLE_SIZE = MAPPING_SIZE(LOW_PAGES) BOOTBITMAP_SIZE = LOW_PAGES / 8 ALLOCATOR_SLOP = 4 INIT_MAP_BEYOND_END = BOOTBITMAP_SIZE + (PAGE_TABLE_SIZE + ALLOCATOR_SLOP)*PAGE_SIZE_asm + +/* + * Where the kernel's initial load-time mapping must end for this code + * to get started. This includes the kernel text+data+bss+enough + * pg0 space to map INIT_MAP_BEYOND_END. This is only really needed for + * bootloaders which start the kernel with paging enabled, which may + * not use this pagetable setup anyway, but its good to be consistent. + */ +.globl pg0_init_size +pg0_init_size = ROUNDUP(INIT_MAP_BEYOND_END / 1024, PAGE_SIZE_asm) /* * 32-bit kernel entrypoint; only used by the boot CPU. On entry, ==================================================================--- a/arch/i386/kernel/vmlinux.lds.S +++ b/arch/i386/kernel/vmlinux.lds.S @@ -199,7 +199,9 @@ SECTIONS /* This is where the kernel creates the early boot page tables */ . = ALIGN(4096); pg0 = . ; - } + __kernel_end = pg0 + pg0_init_size - LOAD_OFFSET - LOAD_PHYSICAL_ADDR ; + } + __kernel_data_size = __init_end - _text ; /* Sections to be discarded */ /DISCARD/ : { ==================================================================--- a/arch/x86_64/boot/compressed/Makefile +++ b/arch/x86_64/boot/compressed/Makefile @@ -6,19 +6,33 @@ # Note all the files here are compiled/linked as 32bit executables. 
# -targets := blob vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o +targets := blob vmlinux.bin vmlinux.bin.gz \ + elfhdr.o head.o misc.o piggy.o CFLAGS := -m64 -D__KERNEL__ $(LINUXINCLUDE) -O2 \ -fno-strict-aliasing -fPIC -mcmodel=small \ $(call cc-option, -ffreestanding) \ $(call cc-option, -fno-stack-protector) AFLAGS := $(CFLAGS) -D__ASSEMBLY__ -LDFLAGS := -m elf_x86_64 -LDFLAGS_blob := -T -$(obj)/blob: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE +LDFLAGS := -m elf_x86_64 -R $(obj)/vmlinux-syms --defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T +OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o piggy.o) + +$(obj)/elfhdr.o: $(src)/../../../i386/boot/compressed/elfhdr.c +$(obj)/notes.o: $(src)/../../../i386/boot/compressed/notes.c + +$(obj)/blob: $(src)/vmlinux.lds $(obj)/vmlinux-syms $(OBJS) FORCE $(call if_changed,ld) @: + +EXTRACTSYMS_blob-syms := blob_entry_64 blob_entry_32 +EXTRACTOUTPUT_blob-syms := --32 +$(obj)/blob-syms: $(obj)/blob FORCE + $(call if_changed,symextract) + +EXTRACTSYMS_vmlinux-syms := __kernel_end __kernel_data_size +$(obj)/vmlinux-syms: vmlinux FORCE + $(call if_changed,symextract) $(obj)/vmlinux.bin: vmlinux FORCE $(call if_changed,objcopy) ==================================================================--- /dev/null +++ b/arch/x86_64/boot/compressed/elfhdr.c @@ -0,0 +1,1 @@ +#include "../../../i386/boot/compressed/elfhdr.c" ==================================================================--- a/arch/x86_64/boot/compressed/head.S +++ b/arch/x86_64/boot/compressed/head.S @@ -32,9 +32,9 @@ .section ".text.head" .code32 - .globl startup_32 - -startup_32: + .globl blob_startup_32 + +blob_startup_32: cld cli movl $(__KERNEL_DS), %eax @@ -158,7 +158,7 @@ 1: movl %eax, 0(%edi) * used to perform that far jump. */ pushl $__KERNEL_CS - leal startup_64(%ebp), %eax + leal blob_startup_64(%ebp), %eax pushl %eax /* Enter paged protected Mode, activating Long Mode */ @@ -183,7 +183,7 @@ 1: */ .code64 .org 0x200 -ENTRY(startup_64) +ENTRY(blob_startup_64) /* We come here either from startup_32 or directly from a * 64bit bootloader. If we come here from a bootloader we depend on * an identity mapped page table being provied that maps our @@ -207,7 +207,7 @@ ENTRY(startup_64) /* Start with the delta to where the kernel will run at. */ #ifdef CONFIG_RELOCATABLE - leaq startup_32(%rip) /* - $startup_32 */, %rbp + leaq blob_startup_32(%rip) /* - $startup_32 */, %rbp addq $(LARGE_PAGE_SIZE - 1), %rbp andq $LARGE_PAGE_MASK, %rbp movq %rbp, %rbx ==================================================================--- a/arch/x86_64/boot/compressed/vmlinux.lds +++ b/arch/x86_64/boot/compressed/vmlinux.lds @@ -1,6 +1,6 @@ OUTPUT_FORMAT("elf64-x86-64") OUTPUT_FORMAT("elf64-x86-64") OUTPUT_ARCH(i386:x86-64) -ENTRY(startup_64) + SECTIONS { /* Be careful parts of head.S assume startup_32 is at @@ -8,7 +8,15 @@ SECTIONS */ . = 0; .text : { + *(.elf.ehdr) + *(.elf.phdr) + _notes = . ; + *(.note*) + _notes_end = .; + _head = . ; + blob_entry_32 = blob_startup_32 + IMAGE_OFFSET; + blob_entry_64 = blob_startup_64 + IMAGE_OFFSET; *(.text.head) _ehead = . ; *(.text.compressed) @@ -29,6 +37,7 @@ SECTIONS *(.data.*) _edata = . ; } + blob_filesz = . ; .bss : { _bss = . ; *(.bss) @@ -41,4 +50,14 @@ SECTIONS . = . + 4096 * 6; _heap = .; } + blob_notesz = _notes_end - _notes; + + /* How much memory we need for decompression: */ + blob_needs = . 
- IMAGE_OFFSET + /* compressed data + decompressor */ + __kernel_data_size + /* uncompressed data */ + (__kernel_data_size / 0x8000 * 8) + 0x8000 + 18; /* overhead */ + + /* Memory we need to reserve in PHDR: + max of our needs and kernel's needs */ + blob_memsz = blob_needs > __kernel_end ? blob_needs : __kernel_end; } ==================================================================--- a/arch/x86_64/kernel/vmlinux.lds.S +++ b/arch/x86_64/kernel/vmlinux.lds.S @@ -232,6 +232,9 @@ SECTIONS _end = . ; + __kernel_end = _end - LOAD_OFFSET; + __kernel_data_size = __init_end - _text; + /* Sections to be discarded */ /DISCARD/ : { *(.exitcall.exit) ==================================================================--- a/include/asm-x86_64/elf-defines.h +++ b/include/asm-x86_64/elf-defines.h @@ -1,1 +1,6 @@ #include <asm-generic/elf64-defines.h> +#define ELF_DATA ELFDATA2LSB + +#ifndef ELF_ARCH +#define ELF_ARCH EM_X86_64 +#endif ==================================================================--- a/include/asm-x86_64/elf.h +++ b/include/asm-x86_64/elf.h @@ -40,8 +40,6 @@ typedef struct user_i387_struct elf_fpre * These are used to set parameters in the core dumps. */ #include <asm/elf-defines.h> -#define ELF_DATA ELFDATA2LSB -#define ELF_ARCH EM_X86_64 #ifdef __KERNEL__ #include <asm/processor.h> ==================================================================--- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -163,3 +163,14 @@ cmd_gzip = gzip -f -9 < $< > $@ cmd_gzip = gzip -f -9 < $< > $@ +# Symbol extraction +# --------------------------------------------------------------------------- +# Generate a stripped-down object including only the symbols needed +# so that we can get them with ld -R. +empty:+space:=$(empty) $(empty) +quiet_cmd_symextract = SYMEXT $@ + cmd_symextract = $(NM) -tx $< | \ + gawk -v 'pattern=$(subst $(space),|,$(strip $(EXTRACTSYMS) $(EXTRACTSYMS_$(@F))))' \ + '$$3~(pattern) {printf ".globl %s;%s=0x%s\n",$$3,$$3,$$1}' \ + | $(AS) $(EXTRACTOUTPUT_$(@F)) -o $@ --
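As the patch description above says, a bootloader finds the embedded ELF file right after the real-mode setup code: the setup occupies (setup_sects + 1) * 512 bytes from the start of the image (boot sector plus setup sectors, with setup_sects == 0 meaning 4 for historical reasons), and the ELF header starts immediately after it. A minimal sketch of that lookup, assuming the whole bzImage has been read into memory; this is not the actual Xen/lguest domain-builder code:

/*
 * Locate the ELF payload that this patch embeds in the bzImage.
 */
#include <elf.h>
#include <string.h>

static const Elf32_Ehdr *bzimage_elf_header(const unsigned char *image)
{
	unsigned int setup_sects = image[0x1f1];	/* boot_params setup_sects */
	const Elf32_Ehdr *ehdr;

	if (setup_sects == 0)
		setup_sects = 4;

	ehdr = (const Elf32_Ehdr *)(image + (setup_sects + 1) * 512);

	if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
		return NULL;	/* old-style bzImage, no ELF payload */

	/* ehdr->e_entry and the program headers at ehdr->e_phoff now
	   describe the entrypoint and the memory the decompressor needs. */
	return ehdr;
}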
Jeremy Fitzhardinge
2007-Jun-20 16:56 UTC
[PATCH 9/9] xen: use boot protocol to boot xen kernel
Boot a Xen kernel using the boot protocol. There are two parts to this: 1. Add Xen-specific notes to the bzImage's internal ELF file, so that the Xen domain builder knows what to do with it. This is simply a matter of adding a new notes-xen.S to the image. The notes depend on the config options, but they contain no addresses, so there's no concern about relocation, or references into the kernel image itself. 2. Do the early setup after booting, mainly to remap the kernel to the proper virtual address. The kernel initially comes up with a P=V 1:1 mapping. We need to copy the appropriate internal pagetable pointers to get the kernel also mapped at __PAGE_OFFSET. In order to simplify this process, we just keep the same pte pages, and only update the pgd/pmd entries (depending on whether its PAE or not, and whether the kernel and Xen want to share the same pgd slot). A pre-requisite for updating the pagetables is setting up the hypercall page in order to do hypercalls. Rather than using the Xen Notes to set this up (which would require a relocatable reference from the bzImage notes into the kernel), we use the Xen reserved MSR to set the page address. Once the kernel has been relocated, we update some of the pointers in the start_info to kernel virtual addresses, and then jump to xen_start_kernel() to do the rest of the setup before calling start_kernel() proper. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> --- arch/i386/boot/compressed/Makefile | 4 arch/i386/boot/compressed/notes-xen.c | 17 ++ arch/i386/kernel/head.S | 2 arch/i386/xen/Makefile | 2 arch/i386/xen/early.c | 216 +++++++++++++++++++++++++++++++++ arch/i386/xen/enlighten.c | 29 +--- arch/i386/xen/xen-head.S | 38 ----- arch/i386/xen/xen-ops.h | 1 include/asm-i386/xen/hypercall.h | 4 9 files changed, 252 insertions(+), 61 deletions(-) ==================================================================--- a/arch/i386/boot/compressed/Makefile +++ b/arch/i386/boot/compressed/Makefile @@ -5,7 +5,7 @@ # targets := blob vmlinux.bin vmlinux.bin.gz \ - elfhdr.o head.o misc.o notes.o piggy.o \ + elfhdr.o head.o misc.o notes.o notes-xen.o piggy.o \ vmlinux.bin.all vmlinux.relocs hostprogs-y := relocs @@ -16,7 +16,7 @@ CFLAGS := -m32 -D__KERNEL__ $(LINUX_INC $(call cc-option,-fno-stack-protector) LDFLAGS := -R $(obj)/vmlinux-syms --defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T -OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o piggy.o) +OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o notes-xen.o piggy.o) $(obj)/blob: $(src)/vmlinux.lds $(obj)/vmlinux-syms $(OBJS) FORCE $(call if_changed,ld) ==================================================================--- /dev/null +++ b/arch/i386/boot/compressed/notes-xen.c @@ -0,0 +1,17 @@ +#ifdef CONFIG_XEN +#include <linux/elfnote.h> +#include <xen/interface/elfnote.h> + +ELFNOTE("Xen", XEN_ELFNOTE_GUEST_OS, "linux"); +ELFNOTE("Xen", XEN_ELFNOTE_GUEST_VERSION, "2.6"); +ELFNOTE("Xen", XEN_ELFNOTE_XEN_VERSION, "xen-3.0"); +ELFNOTE("Xen", XEN_ELFNOTE_FEATURES, + "!writable_page_tables|pae_pgdir_above_4gb"); +ELFNOTE("Xen", XEN_ELFNOTE_LOADER, "generic"); + +#ifdef CONFIG_X86_PAE + ELFNOTE("Xen", XEN_ELFNOTE_PAE_MODE, "yes"); +#else + ELFNOTE("Xen", XEN_ELFNOTE_PAE_MODE, "no"); +#endif +#endif ==================================================================--- a/arch/i386/kernel/head.S +++ b/arch/i386/kernel/head.S @@ -594,8 +594,6 @@ fault_msg: .ascii "Int %d: CR2 %p err %p EIP %p CS %p flags %p\n" .asciz "Stack: %p %p %p %p %p %p %p %p\n" -#include "../xen/xen-head.S" - /* * The IDT 
and GDT 'descriptors' are a strange 48-bit object * only used by the lidt and lgdt instructions. They are not ==================================================================--- a/arch/i386/xen/Makefile +++ b/arch/i386/xen/Makefile @@ -1,4 +1,4 @@ obj-y := enlighten.o setup.o features.o -obj-y := enlighten.o setup.o features.o multicalls.o mmu.o \ +obj-y := early.o enlighten.o setup.o features.o multicalls.o mmu.o \ events.o time.o manage.o xen-asm.o obj-$(CONFIG_SMP) += smp.o ==================================================================--- /dev/null +++ b/arch/i386/xen/early.c @@ -0,0 +1,216 @@ +/* + * Very earliest code, run before we're in the kernel virtual address + * space. As a result, we need to be careful about touching static + * symbols or any absolute address. + */ + +#include <linux/types.h> +#include <linux/bug.h> +#include <linux/sched.h> + +#include <asm/bootparam.h> +#include <asm/setup.h> +#include <asm/paravirt.h> +#include <asm/pgtable.h> + +#include <xen/interface/xen.h> +#include <xen/page.h> +#include <asm/xen/interface.h> +#include <asm/xen/hypercall.h> + +#include "xen-ops.h" + +#define PA(ptr) ((typeof(ptr)) __pa_symbol((ptr))) + +extern char _end[]; + +static inline void early_cpuid(unsigned int *eax, unsigned int *ebx, + unsigned int *ecx, unsigned int *edx) +{ + asm(XEN_EMULATE_PREFIX "cpuid" + : "=a" (*eax), + "=b" (*ebx), + "=c" (*ecx), + "=d" (*edx) + : "0" (*eax), "2" (*ecx)); +} + +static __init u64 early_p2m(unsigned long *mfn_list, + unsigned long phys) +{ + unsigned offset = phys & ~PAGE_MASK; + return PFN_PHYS((u64)mfn_list[PFN_DOWN(phys)]) + offset; +} + +static __init unsigned long early_m2p(unsigned long mfn) +{ + unsigned long ret = machine_to_phys_mapping[mfn]; + if (ret == ~0) + ret = 0; + return ret; +} + +static __init void setup_hypercall_page(struct start_info *info) +{ + unsigned long *mfn_list = (unsigned long *)info->mfn_list; + unsigned eax, ebx, ecx, edx; + unsigned long hypercall_mfn; + + /* leaf 0x40000000 is a virtual machine leaf */ + eax = 0x40000000; + ecx = 0; + early_cpuid(&eax, &ebx, &ecx, &edx); + + /* No way we should be able to get here without being under Xen */ + if (ebx != 0x566e6558 || /* Signature 1: "XenV" */ + ecx != 0x65584d4d || /* Signature 2: "MMXe" */ + edx != 0x4d4d566e || /* Signature 3: "nVMM" */ + eax < 0x40000002) + BUG(); + + /* Get the number of hypercall pages (we only need 1) and the + Xen MSR base */ + eax = 0x40000002; + early_cpuid(&eax, &ebx, &ecx, &edx); + + /* Use magic msr to set the address of the hypercall page */ + hypercall_mfn = PFN_DOWN(__pa_symbol(hypercall_page)); + if (mfn_list) + hypercall_mfn = mfn_list[hypercall_mfn]; + + native_write_msr(ebx + 0, ((u64)hypercall_mfn) << PAGE_SHIFT); +} + +static __init pmd_t *get_pmd(pgd_t *pgd, unsigned long addr) +{ + unsigned idx = pgd_index(addr); + pmd_t *ret; + +#ifdef CONFIG_X86_PAE + { + unsigned long pfn; + pfn = PFN_DOWN(pgd_val_ma(pgd[idx])); + pfn = machine_to_phys_mapping[pfn]; + + ret = ((pmd_t *)PFN_PHYS(pfn)) + pmd_index(addr); + } +#else + ret = (pmd_t *)&pgd[idx]; +#endif + + return ret; +} + +static __init void copy_mapping(struct start_info *info, void *src, void *dst) +{ + unsigned long *mfn_list = (unsigned long *)info->mfn_list; + struct mmu_update u; + + u.ptr = early_p2m(mfn_list, (unsigned long)dst); + u.val = pte_val_ma(*(pte_t *)src); + + if (HYPERVISOR_mmu_update(&u, 1, NULL, DOMID_SELF) != 0) + BUG(); +} + +static __init void remap_addr_pmd(struct start_info *info, unsigned long addr) +{ + pgd_t *pgd = (pgd_t 
*)info->pt_base; + pmd_t *src = get_pmd(pgd, addr); + pmd_t *dst = get_pmd(pgd, addr + __PAGE_OFFSET); + + copy_mapping(info, src, dst); +} + +static __init void remap_kernel_pmd(struct start_info *info, + unsigned long addr, unsigned long max) +{ + while (addr < max) { + remap_addr_pmd(info, addr); + addr += PMD_SIZE; + } +} + +static __init void remap_addr_pgd(struct start_info *info, unsigned long addr) +{ + pgd_t *pgd = (pgd_t *)info->pt_base; + pgd_t *src = &pgd[pgd_index(addr)]; + pgd_t *dst = &pgd[pgd_index(addr + __PAGE_OFFSET)]; + + copy_mapping(info, src, dst); +} + +static __init void remap_kernel_pgd(struct start_info *info, + unsigned long addr, unsigned long max) +{ + while (addr < max) { + remap_addr_pgd(info, addr); + addr += PGDIR_SIZE; + } +} + +static __init void remap_kernel(struct start_info *info, + unsigned long start, unsigned long max) +{ + pgd_t *pgd = (pgd_t *)info->pt_base; + +#ifdef CONFIG_X86_PAE + /* + * If we're running PAE, the kernel will probably want to be + * mapped into the same pgd slot as Xen. If so, we need to + * copy the pmd entries rather than the pgd entries. If not, + * either the kernel doesn't want to be at the top of the + * address space, or Xen has decided not to reserve any space; + * in that case, we can just clone the pgd entries. + */ + if (pgd_val_ma(pgd[pgd_index(__PAGE_OFFSET)]) != 0) { + remap_kernel_pmd(info, start, max); + return; + } +#endif + + remap_kernel_pgd(info, start, max); +} + +void __init xen_entry(void) +{ + struct start_info *info; + unsigned long limit; + + info = (struct start_info *)(unsigned long) + (PA(&boot_params)->hdr.hardware_subarch_data); + + BUG_ON(memcmp(info->magic, PA(&"xen-3.0"), 7) != 0); + + /* establish a hypercall page */ + setup_hypercall_page(info); + + /* work out how far we need to remap */ + limit = __pa(_end); + limit = max(limit, info->pt_base + (info->nr_pt_frames * PAGE_SIZE)); + limit = max(limit, info->mfn_list + + (info->nr_pages * sizeof(unsigned long))); + if (info->mod_start) + limit = max(limit, info->mod_start + info->mod_len); + limit = max(limit, early_m2p(info->console.domU.mfn) << PAGE_SHIFT); + limit = max(limit, early_m2p(info->store_mfn) << PAGE_SHIFT); + + limit += PAGE_SIZE; + + /* remap the kernel to its virtual address */ + remap_kernel(info, 0, limit); + + /* repoint things to their new virtual addresses */ + info->pt_base = (unsigned long)__va(info->pt_base); + info->mfn_list = (unsigned long)__va(info->mfn_list); + + init_pg_tables_end = limit; + + asm volatile("mov %0,%%esp;" + "push $0;" + "jmp *%1" + : + : "i" (&init_thread_union.stack[THREAD_SIZE/sizeof(long)]), + "r" (xen_start_kernel) + : "memory"); +} ==================================================================--- a/arch/i386/xen/enlighten.c +++ b/arch/i386/xen/enlighten.c @@ -50,6 +50,8 @@ #include "mmu.h" #include "multicalls.h" +struct hypercall_entry hypercall_page[PAGE_SIZE/sizeof(struct hypercall_entry)] + __attribute__((aligned(PAGE_SIZE), section(".bss.page_aligned"))); EXPORT_SYMBOL_GPL(hypercall_page); DEFINE_PER_CPU(enum paravirt_lazy_mode, xen_lazy_mode); @@ -1096,15 +1098,19 @@ static void __init xen_reserve_top(void) reserve_top_address(-top + 2 * PAGE_SIZE); } -/* First C function to be called on Xen boot */ -asmlinkage void __init xen_start_kernel(void) +/* + * This is jumped to by early.c, once we're running in the proper + * kernel virtual address space. 
+ */ +void __init xen_start_kernel(void) { pgd_t *pgd; - if (!xen_start_info) - return; - - BUG_ON(memcmp(xen_start_info->magic, "xen-3.0", 7) != 0); + xen_start_info = (struct start_info *) + __va(boot_params.hdr.hardware_subarch_data); + + /* Get mfn list */ + phys_to_machine_mapping = (unsigned long *)xen_start_info->mfn_list; /* Install Xen paravirt ops */ paravirt_ops = xen_paravirt_ops; @@ -1116,13 +1122,7 @@ asmlinkage void __init xen_start_kernel( xen_setup_features(); - /* Get mfn list */ - if (!xen_feature(XENFEAT_auto_translated_physmap)) - phys_to_machine_mapping = (unsigned long *)xen_start_info->mfn_list; - pgd = (pgd_t *)xen_start_info->pt_base; - - init_pg_tables_end = __pa(pgd) + xen_start_info->nr_pt_frames*PAGE_SIZE; init_mm.pgd = pgd; /* use the Xen pagetables to start */ @@ -1152,11 +1152,6 @@ asmlinkage void __init xen_start_kernel( new_cpu_data.hard_math = 1; new_cpu_data.x86_capability[0] = cpuid_edx(1); - /* Poke various useful things into boot_params */ - LOADER_TYPE = (9 << 4) | 0; - INITRD_START = xen_start_info->mod_start ? __pa(xen_start_info->mod_start) : 0; - INITRD_SIZE = xen_start_info->mod_len; - /* Start the world */ start_kernel(); } ==================================================================--- a/arch/i386/xen/xen-head.S +++ /dev/null @@ -1,38 +0,0 @@ -/* Xen-specific pieces of head.S, intended to be included in the right - place in head.S */ - -#ifdef CONFIG_XEN - -#include <linux/elfnote.h> -#include <asm/boot.h> -#include <xen/interface/elfnote.h> - -.pushsection .init.text,"ax",@progbits -ENTRY(startup_xen) - movl %esi,xen_start_info - cld - movl $(init_thread_union+THREAD_SIZE),%esp - jmp xen_start_kernel -.popsection - -.pushsection ".bss.page_aligned" - .align PAGE_SIZE_asm -ENTRY(hypercall_page) - .skip 0x1000 -.popsection - - ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS, .asciz "linux") - ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION, .asciz "2.6") - ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION, .asciz "xen-3.0") - ELFNOTE(Xen, XEN_ELFNOTE_VIRT_BASE, .long __PAGE_OFFSET) - ELFNOTE(Xen, XEN_ELFNOTE_ENTRY, .long startup_xen) - ELFNOTE(Xen, XEN_ELFNOTE_HYPERCALL_PAGE, .long hypercall_page) - ELFNOTE(Xen, XEN_ELFNOTE_FEATURES, .asciz "!writable_page_tables|pae_pgdir_above_4gb") -#ifdef CONFIG_X86_PAE - ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE, .asciz "yes") -#else - ELFNOTE(Xen, XEN_ELFNOTE_PAE_MODE, .asciz "no") -#endif - ELFNOTE(Xen, XEN_ELFNOTE_LOADER, .asciz "generic") - -#endif /*CONFIG_XEN */ ==================================================================--- a/arch/i386/xen/xen-ops.h +++ b/arch/i386/xen/xen-ops.h @@ -19,6 +19,7 @@ void __init xen_arch_setup(void); void __init xen_arch_setup(void); void __init xen_init_IRQ(void); +void xen_start_kernel(void); void xen_setup_timer(int cpu); void xen_setup_cpu_clockevents(void); unsigned long xen_cpu_khz(void); ==================================================================--- a/include/asm-i386/xen/hypercall.h +++ b/include/asm-i386/xen/hypercall.h @@ -40,7 +40,9 @@ #include <xen/interface/sched.h> #include <xen/interface/physdev.h> -extern struct { char _entry[32]; } hypercall_page[]; +extern struct hypercall_entry { + char _entry[32]; +} hypercall_page[]; #define _hypercall0(type, name) \ ({ \ --

[ This patch depends on the cross-architecture ELF cleanup patch. ]

This series updates the boot protocol to 2.07 and uses it to implement paravirtual booting. This allows the bootloader to tell the kernel what kind of hardware/pseudo-hardware environment it's coming up under, so the kernel can use the appropriate boot sequence code. Specifically:

- Update the boot protocol to 2.07, which adds fields to specify the hardware subarchitecture and some subarchitecture-specific data. It also specifies a flag to tell the boot code to avoid reloading segment registers and playing with interrupt state, since it may not have a visible gdt and/or may not be running in ring 0. (Note: the segment reload and interrupt flags are conflated into one flag, but they are not really related. We could have two flags, but the "cli" is probably completely redundant anyway, since the bootloader would have to be completely mad to start the kernel with interrupts enabled.)

- Change the format of bzImage to contain an ELF file. The initial part of the bzImage is still the boot_params header followed by the 16-bit setup code needed for booting from BIOS. But rather than having the self-decompressing kernel follow as a naked blob of code+data, it's actually wrapped in a page-aligned ELF file. This allows the bootloader to extract it and parse it, and use that to know what memory the booting kernel will need initially. Xen and lguest need this because they start the kernel with paging enabled, and so need to know what initial mappings to create.

- Modify the kernel boot sequence to:
  1. avoid reloading the segment state (gdt and segment registers) if the bootloader asks it to, and
  2. jump to a subarchitecture-specific entrypoint in kernel/head.S.

- Add Xen-specific startup code, which mainly remaps the kernel from its P=V identity mapping to the normal PAGE_OFFSET mapping.

One open issue is that I haven't made the normal head.S initial pagetable construction code generally reusable. The default boot on normal x86 hardware still uses it of course, but other subarchitectures like Voyager and lguest could probably use it as-is, while still needing to do other specialized things. The obvious fix is to make it a callable function, but we don't generally assume there's a stack available at this early stage. It looks like it would be easy to set one up, though.

As a prerequisite for all the above, I've also cleaned up the bzImage build process. I've eliminated the need for the tools/build program, and instead use the linker to do more of the heavy lifting. I've also removed some somewhat obscure uses of ld and objcopy to wrap binary files in ELF .o wrappers, and replaced them with .S files containing .incbin. The downside is that this makes more complex use of linker scripts, which always opens scope for finding more binutils bugs. Only one way to find out...

Tested to check that the generated kernels boot under qemu's internal bootloader and grub, as well as booting under Xen (with an appropriate update to the Xen domain builder).

TODO:
- poke Rusty into implementing the lguest bits
- look at Kbuild use in arch/{i386,x86_64}/boot/

J
--

This patch uses the updated boot protocol to do paravirtualized boot. If the boot version is >= 2.07, then it will do two things: 1. Check the bootparams loadflags to see if we should reload the segment registers and clear interrupts. This is appropriate for normal native boot and some paravirtualized environments, but inapproprate for others. 2. Check the hardware architecture, and dispatch to the appropriate kernel entrypoint. If the bootloader doesn't set this, then we simply do the normal boot sequence. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> --- arch/i386/boot/compressed/head.S | 14 +++++++++-- arch/i386/boot/compressed/misc.c | 4 +++ arch/i386/boot/header.S | 7 ++++- arch/i386/kernel/head.S | 47 ++++++++++++++++++++++++++++++++++---- 4 files changed, 65 insertions(+), 7 deletions(-) ==================================================================--- a/arch/i386/boot/compressed/head.S +++ b/arch/i386/boot/compressed/head.S @@ -33,14 +33,24 @@ .globl blob_startup_32 blob_startup_32: - cld - cli + /* check to see if KEEP_SEGMENTS flag is meaningful */ + cmpw $0x207, BP_version(%esi) + jb 1f + + /* test KEEP_SEGMENTS flag to see if the bootloader is asking + us to not reload segments */ + testb $(1<<6), BP_loadflags(%esi) + jnz 2f + +1: cli movl $(__BOOT_DS),%eax movl %eax,%ds movl %eax,%es movl %eax,%fs movl %eax,%gs movl %eax,%ss + +2: cld /* Calculate the delta between where we were compiled to run * at and where we were actually loaded at. This can only be done ==================================================================--- a/arch/i386/boot/compressed/misc.c +++ b/arch/i386/boot/compressed/misc.c @@ -245,6 +245,10 @@ static void putstr(const char *s) { int x,y,pos; char c; + + if (RM_SCREEN_INFO.orig_video_mode == 0 && + lines == 0 && cols == 0) + return; x = RM_SCREEN_INFO.orig_x; y = RM_SCREEN_INFO.orig_y; ==================================================================--- a/arch/i386/boot/header.S +++ b/arch/i386/boot/header.S @@ -119,7 +119,7 @@ 1: # Part 2 of the header, from the old setup.S .ascii "HdrS" # header signature - .word 0x0206 # header version number (>= 0x0105) + .word 0x0207 # header version number (>= 0x0105) # or else old loadlin-1.5 will fail) .globl realmode_swtch realmode_swtch: .word 0, 0 # default_switch, SETUPSEG @@ -209,6 +209,11 @@ cmdline_size: .long COMMAND_LINE_SIZ #added with boot protocol #version 2.06 +hardware_subarch: .long 0 # subarchitecture, added with 2.07 + # default to 0 for normal x86 PC + +hardware_subarch_data: .quad 0 + # End of setup header ##################################################### .section ".inittext", "ax" ==================================================================--- a/arch/i386/kernel/head.S +++ b/arch/i386/kernel/head.S @@ -86,28 +86,37 @@ pg0_init_size = ROUNDUP(INIT_MAP_BEYOND_ */ .section .text.head,"ax",@progbits ENTRY(startup_32) + /* check to see if KEEP_SEGMENTS flag is meaningful */ + cmpw $0x207, BP_version(%esi) + jb 1f + + /* test KEEP_SEGMENTS flag to see if the bootloader is asking + us to not reload segments */ + testb $(1<<6), BP_loadflags(%esi) + jnz 2f /* * Set segments to known values. 
*/ - cld - lgdt boot_gdt_descr - __PAGE_OFFSET +1: lgdt boot_gdt_descr - __PAGE_OFFSET movl $(__BOOT_DS),%eax movl %eax,%ds movl %eax,%es movl %eax,%fs movl %eax,%gs +2: /* * Clear BSS first so that there are no surprises... - * No need to cld as DF is already clear from cld above... - */ + */ + cld xorl %eax,%eax movl $__bss_start - __PAGE_OFFSET,%edi movl $__bss_stop - __PAGE_OFFSET,%ecx subl %edi,%ecx shrl $2,%ecx rep ; stosl + /* * Copy bootup parameters out of the way. * Note: %esi still has the pointer to the real-mode data. @@ -135,6 +144,35 @@ 2: movsl 1: +#ifdef CONFIG_PARAVIRT + cmpw $0x207, (boot_params + BP_version - __PAGE_OFFSET) + jb default_entry + + /* Paravirt-compatible boot parameters. Look to see what architecture + we're booting under. */ + movl (boot_params + BP_hardware_subarch - __PAGE_OFFSET), %eax + cmpl $num_subarch_entries, %eax + jae bad_subarch + + movl subarch_entries - __PAGE_OFFSET(,%eax,4), %eax + subl $__PAGE_OFFSET, %eax + jmp *%eax + +bad_subarch: +WEAK(lguest_entry) +WEAK(xen_entry) + /* Unknown implementation; there's really + nothing we can do at this point. */ + ud2a +.data +subarch_entries: + .long default_entry /* normal x86/PC */ + .long lguest_entry /* lguest hypervisor */ + .long xen_entry /* Xen hypervisor */ +num_subarch_entries = (. - subarch_entries) / 4 +.previous +#endif /* CONFIG_PARAVIRT */ + /* * Initialize page tables. This creates a PDE and a set of page * tables, which are located immediately beyond _end. The variable @@ -147,6 +185,7 @@ 1: */ page_pde_offset = (__PAGE_OFFSET >> 20); +default_entry: movl $(pg0 - __PAGE_OFFSET), %edi movl $(swapper_pg_dir - __PAGE_OFFSET), %edx movl $0x007, %eax /* 0x007 = PRESENT+RW+USER */ --
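The head.S hunk above is dense, so here is a self-contained C rendering of the same dispatch, purely to make the control flow easier to follow. This is not kernel code: the struct and the stub entry functions stand in for the real setup_header and the assembly entry points.

/*
 * C rendering of the subarch dispatch performed in head.S: fall back to
 * the default path for pre-2.07 bootloaders, otherwise index the entry
 * table by hardware_subarch, giving up on values we don't know about.
 */
#include <stdio.h>
#include <stdint.h>

struct fake_setup_header {		/* stand-in for struct setup_header */
	uint16_t version;		/* boot protocol version, e.g. 0x0207 */
	uint32_t hardware_subarch;	/* 0 = PC, 1 = lguest, 2 = Xen */
};

static void default_entry(void) { printf("normal x86/PC boot path\n"); }
static void lguest_entry(void)  { printf("lguest entry (WEAK stub unless built in)\n"); }
static void xen_entry(void)     { printf("xen entry (WEAK stub unless built in)\n"); }

static void (*const subarch_entries[])(void) = {
	default_entry,			/* 0: normal x86/PC */
	lguest_entry,			/* 1: lguest hypervisor */
	xen_entry,			/* 2: Xen hypervisor */
};

static void dispatch(const struct fake_setup_header *hdr)
{
	/* Pre-2.07 bootloaders never set hardware_subarch. */
	if (hdr->version < 0x0207) {
		default_entry();
		return;
	}
	if (hdr->hardware_subarch >=
	    sizeof(subarch_entries) / sizeof(subarch_entries[0])) {
		printf("unknown subarch: nothing we can do (ud2a in head.S)\n");
		return;
	}
	subarch_entries[hdr->hardware_subarch]();
}

int main(void)
{
	struct fake_setup_header hdr = { .version = 0x0207,
					 .hardware_subarch = 2 };
	dispatch(&hdr);
	return 0;
}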
Jeremy Fitzhardinge
2007-Jun-20 16:57 UTC
[PATCH 8/9] ask the hypervisor how much space it needs reserved
Ask the hypervisor how much space it needs reserved, since 32-on-64 doesn't need any space, and it may change in future. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> --- arch/i386/xen/enlighten.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) ==================================================================--- a/arch/i386/xen/enlighten.c +++ b/arch/i386/xen/enlighten.c @@ -1085,6 +1085,17 @@ static const struct machine_ops __initda }; +static void __init xen_reserve_top(void) +{ + unsigned long top = HYPERVISOR_VIRT_START; + struct xen_platform_parameters pp; + + if (HYPERVISOR_xen_version(XENVER_platform_parameters, &pp) == 0) + top = pp.virt_start; + + reserve_top_address(-top + 2 * PAGE_SIZE); +} + /* First C function to be called on Xen boot */ asmlinkage void __init xen_start_kernel(void) { @@ -1134,7 +1145,7 @@ asmlinkage void __init xen_start_kernel( paravirt_ops.kernel_rpl = 0; /* set the limit of our address space */ - reserve_top_address(-HYPERVISOR_VIRT_START + 2 * PAGE_SIZE); + xen_reserve_top(); /* set up basic CPUID stuff */ cpu_detect(&new_cpu_data); --
Proposed updates for version 2.07 of the boot protocol. This includes: load_flags.KEEP_SEGMENTS- flag to request/inhibit segment reloads hardware_subarch - what subarchitecture we're booting under hardware_subarch_data - per-architecture data The intention of these changes is to make booting a paravirtualized kernel work via the normal Linux boot protocol. The intention is that the bzImage payload can be a properly formed ELF file, so that the bootloader can use its ELF notes and Phdrs to get more metadata about the kernel and its requirements. The ELF file could be the uncompressed kernel vmlinux itself; it would only take small buildsystem changes to implement this. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> --- Documentation/i386/boot.txt | 34 +++++++++++++++++++++++++++++++++- arch/i386/kernel/asm-offsets.c | 7 +++++++ include/asm-i386/bootparam.h | 9 +++++++-- 3 files changed, 47 insertions(+), 3 deletions(-) ==================================================================--- a/Documentation/i386/boot.txt +++ b/Documentation/i386/boot.txt @@ -168,6 +168,8 @@ 0234/1 2.05+ relocatable_kernel Whether 0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not 0235/3 N/A pad2 Unused 0238/4 2.06+ cmdline_size Maximum size of the kernel command line +023C/4 2.07+ hardware_subarch Hardware subarchitecture +0240/8 2.07+ hardware_subarch_data Subarchitecture-specific data (1) For backwards compatibility, if the setup_sects field contains 0, the real value is 4. @@ -204,7 +206,7 @@ boot loaders can ignore those fields. The byte order of all fields is littleendian (this is x86, after all.) -Field name: setup_secs +Field name: setup_sects Type: read Offset/size: 0x1f1/1 Protocol: ALL @@ -356,6 +358,13 @@ Protocol: 2.00+ - If 0, the protected-mode code is loaded at 0x10000. - If 1, the protected-mode code is loaded at 0x100000. + Bit 6 (write): KEEP_SEGMENTS + Protocol: 2.07+ + - if 0, reload the segment registers in the 32bit entry point. + - if 1, do not reload the segment registers in the 32bit entry point. + Assume that %cs %ds %ss %es are all set to flat segments with + a base of 0 (or the equivalent for their environment). + Bit 7 (write): CAN_USE_HEAP Set this bit to 1 to indicate that the value entered in the heap_end_ptr is valid. If this field is clear, some setup code @@ -479,6 +488,29 @@ Protocol: 2.06+ zero. This means that the command line can contain at most cmdline_size characters. With protocol version 2.05 and earlier, the maximum size was 255. + +Field name: hardware_subarch +Type: write +Offset/size: 0x23c/4 +Protocol: 2.07+ + + In a paravirtualized environment the hardware low level architectural + pieces such as interrupt handling, page table handling, and + accessing process control registers needs to be done differently. + + This field allows the bootloader to inform the kernel we are in one + one of those environments. 
+ + 0x00000000 The default x86/PC environment + 0x00000001 lguest + 0x00000002 Xen + +Field name: hardware_subarch_data +Type: write +Offset/size: 0x240/8 +Protocol: 2.07+ + + A pointer to data that is specific to hardware subarch **** THE KERNEL COMMAND LINE ==================================================================--- a/arch/i386/kernel/asm-offsets.c +++ b/arch/i386/kernel/asm-offsets.c @@ -15,6 +15,7 @@ #include <asm/fixmap.h> #include <asm/processor.h> #include <asm/thread_info.h> +#include <asm/bootparam.h> #include <asm/elf.h> #include <xen/interface/xen.h> @@ -143,4 +144,10 @@ void foo(void) OFFSET(LGUEST_PAGES_regs_errcode, lguest_pages, regs.errcode); OFFSET(LGUEST_PAGES_regs, lguest_pages, regs); #endif + + BLANK(); + OFFSET(BP_scratch, boot_params, scratch); + OFFSET(BP_loadflags, boot_params, hdr.loadflags); + OFFSET(BP_hardware_subarch, boot_params, hdr.hardware_subarch); + OFFSET(BP_version, boot_params, hdr.version); } ==================================================================--- a/include/asm-i386/bootparam.h +++ b/include/asm-i386/bootparam.h @@ -24,8 +24,9 @@ struct setup_header { u16 kernel_version; u8 type_of_loader; u8 loadflags; -#define LOADED_HIGH 0x01 -#define CAN_USE_HEAP 0x80 +#define LOADED_HIGH (1<<0) +#define KEEP_SEGMENTS (1<<6) +#define CAN_USE_HEAP (1<<7) u16 setup_move_size; u32 code32_start; u32 ramdisk_image; @@ -37,6 +38,10 @@ struct setup_header { u32 initrd_addr_max; u32 kernel_alignment; u8 relocatable_kernel; + u8 _pad2[3]; + u32 cmdline_size; + u32 hardware_subarch; + u64 hardware_subarch_data; } __attribute__((packed)); struct sys_desc_table { --
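From the bootloader's side, using the new 2.07 fields amounts to checking the kernel's advertised protocol version and then writing hardware_subarch, hardware_subarch_data and the KEEP_SEGMENTS loadflag into the setup header. A sketch of that, not taken from the patch: the function name is made up, the subarch/data offsets are the ones documented above, and the version (0x206) and loadflags (0x211) offsets are the long-standing ones from boot.txt; a little-endian build host is assumed.

/*
 * Fill in the paravirt boot fields in an in-memory copy of the setup
 * image before handing it to the kernel.
 */
#include <stdint.h>
#include <string.h>

#define KEEP_SEGMENTS (1 << 6)

static int set_paravirt_boot_fields(uint8_t *setup, uint32_t subarch,
				    uint64_t subarch_data)
{
	uint16_t version;

	memcpy(&version, setup + 0x206, sizeof(version));
	if (version < 0x0207)
		return -1;	/* kernel too old for these fields */

	memcpy(setup + 0x23c, &subarch, sizeof(subarch));	/* hardware_subarch */
	memcpy(setup + 0x240, &subarch_data, sizeof(subarch_data));

	/* Ask the kernel not to touch segments/interrupts on entry. */
	setup[0x211] |= KEEP_SEGMENTS;
	return 0;
}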
This patch cleans up image generation in several ways: - Firstly, it removes tools/build, and uses binutils to do all the final construction of the bzImage. This removes a chunk of code and makes the image generation more flexible, since we can compute various numbers rather than be forced to use fixed constants. - Rename compressed/vmlinux to compressed/blob, to make it a bit clearer that it's the compressed kernel image + decompressor (now all the files named "vmlinux*" are directly derived from the kernel vmlinux). - Rather than using objcopy to wrap the compressed kernel into an object file, simply use the assembler: payload.S does a .bininc of the blob.bin file, which allows us to easily place it into a section, and it makes the Makefile dependency a little clearer. - Similarly, use the same technique to create compressed/piggy.o, which cleans things up even more, since the .S file can also set the input and output_size symbols without further linker script hackery; it also removes a complete linker script. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> --- arch/i386/boot/Makefile | 31 +---- arch/i386/boot/compressed/Makefile | 13 -- arch/i386/boot/compressed/piggy.S | 10 + arch/i386/boot/compressed/vmlinux.lds | 3 arch/i386/boot/compressed/vmlinux.scr | 10 - arch/i386/boot/header.S | 7 - arch/i386/boot/payload.S | 3 arch/i386/boot/setup.ld | 37 ++++-- arch/i386/boot/tools/.gitignore | 1 arch/i386/boot/tools/build.c | 168 ------------------------------- arch/x86_64/boot/compressed/Makefile | 12 -- arch/x86_64/boot/compressed/piggy.S | 10 + arch/x86_64/boot/compressed/vmlinux.lds | 2 arch/x86_64/boot/compressed/vmlinux.scr | 10 - 14 files changed, 73 insertions(+), 244 deletions(-) ==================================================================--- a/arch/i386/boot/Makefile +++ b/arch/i386/boot/Makefile @@ -25,12 +25,13 @@ SVGA_MODE := -DSVGA_MODE=NORMAL_VGA #RAMDISK := -DRAMDISK=512 -targets := vmlinux.bin setup.bin setup.elf zImage bzImage +targets := blob.bin setup.elf zImage bzImage subdir- := compressed setup-y += a20.o apm.o cmdline.o copy.o cpu.o cpucheck.o edd.o -setup-y += header.o main.o mca.o memory.o pm.o pmjump.o -setup-y += printf.o string.o tty.o video.o version.o voyager.o +setup-y += header.o main.o mca.o memory.o payload.o pm.o +setup-y += pmjump.o printf.o string.o tty.o video.o version.o +setup-y += voyager.o # The link order of the video-*.o modules can matter. In particular, # video-vga.o *must* be listed first, followed by video-vesa.o. 
@@ -39,10 +40,6 @@ setup-y += video-vga.o setup-y += video-vga.o setup-y += video-vesa.o setup-y += video-bios.o - -hostprogs-y := tools/build - -HOSTCFLAGS_build.o := $(LINUXINCLUDE) # --------------------------------------------------------------------------- @@ -65,18 +62,12 @@ AFLAGS := $(CFLAGS) -D__ASSEMBLY__ $(obj)/bzImage: IMAGE_OFFSET := 0x100000 $(obj)/bzImage: EXTRA_CFLAGS := -D__BIG_KERNEL__ $(obj)/bzImage: EXTRA_AFLAGS := $(SVGA_MODE) $(RAMDISK) -D__BIG_KERNEL__ -$(obj)/bzImage: BUILDFLAGS := -b -quiet_cmd_image = BUILD $@ -cmd_image = $(obj)/tools/build $(BUILDFLAGS) $(obj)/setup.bin \ - $(obj)/vmlinux.bin $(ROOT_DEV) > $@ - -$(obj)/zImage $(obj)/bzImage: $(obj)/setup.bin \ - $(obj)/vmlinux.bin $(obj)/tools/build FORCE - $(call if_changed,image) +$(obj)/zImage $(obj)/bzImage: $(obj)/setup.elf FORCE + $(call if_changed,objcopy) @echo 'Kernel: $@ is ready' ' (#'`cat .version`')' -$(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE +$(obj)/blob.bin: $(obj)/compressed/blob FORCE $(call if_changed,objcopy) SETUP_OBJS = $(addprefix $(obj)/,$(setup-y)) @@ -85,12 +76,10 @@ LDFLAGS_setup.elf := -T $(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE $(call if_changed,ld) -OBJCOPYFLAGS_setup.bin := -O binary +$(obj)/payload.o: EXTRA_AFLAGS := -Wa,-I$(obj) +$(obj)/payload.o: $(src)/payload.S $(obj)/blob.bin -$(obj)/setup.bin: $(obj)/setup.elf FORCE - $(call if_changed,objcopy) - -$(obj)/compressed/vmlinux: FORCE +$(obj)/compressed/blob: FORCE $(Q)$(MAKE) $(build)=$(obj)/compressed IMAGE_OFFSET=$(IMAGE_OFFSET) $@ # Set this if you want to pass append arguments to the zdisk/fdimage/isoimage kernel ==================================================================--- a/arch/i386/boot/compressed/Makefile +++ b/arch/i386/boot/compressed/Makefile @@ -4,11 +4,10 @@ # create a compressed vmlinux image from the original vmlinux # -targets := vmlinux vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \ +targets := blob vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \ vmlinux.bin.all vmlinux.relocs -EXTRA_AFLAGS := -traditional -LDFLAGS_vmlinux := -T +LDFLAGS_blob := -T hostprogs-y := relocs CFLAGS := -m32 -D__KERNEL__ $(LINUX_INCLUDE) -O2 \ @@ -17,7 +16,7 @@ CFLAGS := -m32 -D__KERNEL__ $(LINUX_INC $(call cc-option,-fno-stack-protector) LDFLAGS := -m elf_i386 -$(obj)/vmlinux: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE +$(obj)/blob: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE $(call if_changed,ld) @: @@ -44,7 +43,5 @@ else $(call if_changed,gzip) endif -LDFLAGS_piggy.o := -r --format binary --oformat elf32-i386 -T - -$(obj)/piggy.o: $(src)/vmlinux.scr $(obj)/vmlinux.bin.gz FORCE - $(call if_changed,ld) +$(obj)/piggy.o: EXTRA_AFLAGS := -Wa,-I$(obj) +$(obj)/piggy.o: $(src)/piggy.S $(obj)/vmlinux.bin.gz ==================================================================--- /dev/null +++ b/arch/i386/boot/compressed/piggy.S @@ -0,0 +1,10 @@ +.section .data.compressed,"a",@progbits + +.globl input_data, input_len, output_len + +input_len: .long input_data_end - input_data + +input_data: +.incbin "vmlinux.bin.gz" +output_len = .-4 +input_data_end: ==================================================================--- a/arch/i386/boot/compressed/vmlinux.lds +++ b/arch/i386/boot/compressed/vmlinux.lds @@ -1,4 +1,4 @@ OUTPUT_FORMAT("elf32-i386", "elf32-i386" -OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386") +OUTPUT_FORMAT("elf32-i386") OUTPUT_ARCH(i386) ENTRY(startup_32) SECTIONS @@ -38,6 +38,7 @@ SECTIONS *(.bss) *(.bss.*) *(COMMON) + . 
= ALIGN(8); _end = . ; } } ==================================================================--- a/arch/i386/boot/compressed/vmlinux.scr +++ /dev/null @@ -1,10 +0,0 @@ -SECTIONS -{ - .data.compressed : { - input_len = .; - LONG(input_data_end - input_data) input_data = .; - *(.data) - output_len = . - 4; - input_data_end = .; - } -} ==================================================================--- a/arch/i386/boot/header.S +++ b/arch/i386/boot/header.S @@ -97,9 +97,9 @@ bugger_off_msg: .section ".header", "a" .globl hdr hdr: -setup_sects: .byte SETUPSECTS +setup_sects: .byte _setup_sects root_flags: .word ROOT_RDONLY -syssize: .long SYSSIZE +syssize: .long kernel_size_para ram_size: .word RAMDISK vid_mode: .word SVGA_MODE root_dev: .word ROOT_DEV @@ -148,7 +148,7 @@ CAN_USE_HEAP = 0x80 # If set, the load .byte LOADED_HIGH #endif -setup_move_size: .word 0x8000 # size to move, when setup is not +setup_move_size: .word _setup_size # size to move, when setup is not # loaded at 0x90000. We will move setup # to 0x90000 then just before jumping # into the kernel. However, only the ==================================================================--- /dev/null +++ b/arch/i386/boot/payload.S @@ -0,0 +1,3 @@ +.section .kernel,"a",@progbits + +.incbin "blob.bin" ==================================================================--- a/arch/i386/boot/setup.ld +++ b/arch/i386/boot/setup.ld @@ -3,18 +3,16 @@ * * Linker script for the i386 setup code */ -OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386") +OUTPUT_FORMAT("elf32-i386") OUTPUT_ARCH(i386) ENTRY(_start) SECTIONS { - . = 0; - .bstext : { *(.bstext) } + .bstext 0 : { *(.bstext) } .bsdata : { *(.bsdata) } - . = 497; - .header : { *(.header) } + .header 497 : { *(.header) } .inittext : { *(.inittext) } .initdata : { *(.initdata) } .text : { *(.text*) } @@ -38,16 +36,29 @@ SECTIONS . = ALIGN(16); - __bss_start = .; - .bss : - { - *(.bss) - } - . = ALIGN(16); + .bss ALIGN(16) : { + __bss_start = .; + *(.bss) + . = ALIGN(16); + } _end = .; /DISCARD/ : { *(.note*) } - . = ASSERT(_end <= 0x8000, "Setup too big!"); - . = ASSERT(hdr == 0x1f1, "The setup header has the wrong offset!"); + . = ALIGN(512); /* align to sector size */ + _setup_size = . - _start; + _setup_sects = _setup_size / 512; + + /* compressed kernel data */ + .kernel : { + kernel = .; + *(.kernel) + kernel_end = .; + + } + kernel_size = kernel_end - kernel; + kernel_size_para = (kernel_size + 15) / 16; } + +ASSERT(_end <= 0x8000, "Setup too big!"); +ASSERT(hdr == 0x1f1, "The setup header has the wrong offset!"); ==================================================================--- a/arch/i386/boot/tools/.gitignore +++ /dev/null @@ -1,1 +0,0 @@ -build ==================================================================--- a/arch/i386/boot/tools/build.c +++ /dev/null @@ -1,168 +0,0 @@ -/* - * Copyright (C) 1991, 1992 Linus Torvalds - * Copyright (C) 1997 Martin Mares - * Copyright (C) 2007 H. Peter Anvin - */ - -/* - * This file builds a disk-image from three different files: - * - * - setup: 8086 machine code, sets up system parm - * - system: 80386 code for actual system - * - * It does some checking that all files are of the correct type, and - * just writes the result to stdout, removing headers and padding to - * the right amount. It also writes some system data to stderr. - */ - -/* - * Changes by tytso to allow root device specification - * High loaded stuff by Hans Lermen & Werner Almesberger, Feb. 
1996 - * Cross compiling fixes by Gertjan van Wingerde, July 1996 - * Rewritten by Martin Mares, April 1997 - * Substantially overhauled by H. Peter Anvin, April 2007 - */ - -#include <stdio.h> -#include <string.h> -#include <stdlib.h> -#include <stdarg.h> -#include <sys/types.h> -#include <sys/stat.h> -#include <sys/sysmacros.h> -#include <unistd.h> -#include <fcntl.h> -#include <sys/mman.h> -#include <asm/boot.h> - -typedef unsigned char u8; -typedef unsigned short u16; -typedef unsigned long u32; - -#define DEFAULT_MAJOR_ROOT 0 -#define DEFAULT_MINOR_ROOT 0 - -/* Minimal number of setup sectors */ -#define SETUP_SECT_MIN 5 -#define SETUP_SECT_MAX 64 - -/* This must be large enough to hold the entire setup */ -u8 buf[SETUP_SECT_MAX*512]; -int is_big_kernel; - -static void die(const char * str, ...) -{ - va_list args; - va_start(args, str); - vfprintf(stderr, str, args); - fputc('\n', stderr); - exit(1); -} - -static void usage(void) -{ - die("Usage: build [-b] setup system [rootdev] [> image]"); -} - -int main(int argc, char ** argv) -{ - unsigned int i, sz, setup_sectors; - int c; - u32 sys_size; - u8 major_root, minor_root; - struct stat sb; - FILE *file; - int fd; - void *kernel; - - if (argc > 2 && !strcmp(argv[1], "-b")) - { - is_big_kernel = 1; - argc--, argv++; - } - if ((argc < 3) || (argc > 4)) - usage(); - if (argc > 3) { - if (!strcmp(argv[3], "CURRENT")) { - if (stat("/", &sb)) { - perror("/"); - die("Couldn't stat /"); - } - major_root = major(sb.st_dev); - minor_root = minor(sb.st_dev); - } else if (strcmp(argv[3], "FLOPPY")) { - if (stat(argv[3], &sb)) { - perror(argv[3]); - die("Couldn't stat root device."); - } - major_root = major(sb.st_rdev); - minor_root = minor(sb.st_rdev); - } else { - major_root = 0; - minor_root = 0; - } - } else { - major_root = DEFAULT_MAJOR_ROOT; - minor_root = DEFAULT_MINOR_ROOT; - } - fprintf(stderr, "Root device is (%d, %d)\n", major_root, minor_root); - - /* Copy the setup code */ - file = fopen(argv[1], "r"); - if (!file) - die("Unable to open `%s': %m", argv[1]); - c = fread(buf, 1, sizeof(buf), file); - if (ferror(file)) - die("read-error on `setup'"); - if (c < 1024) - die("The setup must be at least 1024 bytes"); - if (buf[510] != 0x55 || buf[511] != 0xaa) - die("Boot block hasn't got boot flag (0xAA55)"); - fclose(file); - - /* Pad unused space with zeros */ - setup_sectors = (c + 511) / 512; - if (setup_sectors < SETUP_SECT_MIN) - setup_sectors = SETUP_SECT_MIN; - i = setup_sectors*512; - memset(buf+c, 0, i-c); - - /* Set the default root device */ - buf[508] = minor_root; - buf[509] = major_root; - - fprintf(stderr, "Setup is %d bytes (padded to %d bytes).\n", c, i); - - /* Open and stat the kernel file */ - fd = open(argv[2], O_RDONLY); - if (fd < 0) - die("Unable to open `%s': %m", argv[2]); - if (fstat(fd, &sb)) - die("Unable to stat `%s': %m", argv[2]); - sz = sb.st_size; - fprintf (stderr, "System is %d kB\n", (sz+1023)/1024); - kernel = mmap(NULL, sz, PROT_READ, MAP_SHARED, fd, 0); - if (kernel == MAP_FAILED) - die("Unable to mmap '%s': %m", argv[2]); - sys_size = (sz + 15) / 16; - if (!is_big_kernel && sys_size > DEF_SYSSIZE) - die("System is too big. 
Try using bzImage or modules."); - - /* Patch the setup code with the appropriate size parameters */ - buf[0x1f1] = setup_sectors-1; - buf[0x1f4] = sys_size; - buf[0x1f5] = sys_size >> 8; - buf[0x1f6] = sys_size >> 16; - buf[0x1f7] = sys_size >> 24; - - if (fwrite(buf, 1, i, stdout) != i) - die("Writing setup failed"); - - /* Copy the kernel code */ - if (fwrite(kernel, 1, sz, stdout) != sz) - die("Writing kernel failed"); - close(fd); - - /* Everything is OK */ - return 0; -} ==================================================================--- a/arch/x86_64/boot/compressed/Makefile +++ b/arch/x86_64/boot/compressed/Makefile @@ -6,7 +6,7 @@ # Note all the files here are compiled/linked as 32bit executables. # -targets := vmlinux vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o +targets := blob vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o CFLAGS := -m64 -D__KERNEL__ $(LINUXINCLUDE) -O2 \ -fno-strict-aliasing -fPIC -mcmodel=small \ @@ -15,8 +15,8 @@ AFLAGS := $(CFLAGS) -D__ASSEMBLY__ AFLAGS := $(CFLAGS) -D__ASSEMBLY__ LDFLAGS := -m elf_x86_64 -LDFLAGS_vmlinux := -T -$(obj)/vmlinux: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE +LDFLAGS_blob := -T +$(obj)/blob: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE $(call if_changed,ld) @: @@ -26,7 +26,5 @@ LDFLAGS_vmlinux := -T $(obj)/vmlinux.bin.gz: $(obj)/vmlinux.bin FORCE $(call if_changed,gzip) -LDFLAGS_piggy.o := -r --format binary --oformat elf64-x86-64 -T - -$(obj)/piggy.o: $(obj)/vmlinux.scr $(obj)/vmlinux.bin.gz FORCE - $(call if_changed,ld) +$(obj)/piggy.o: EXTRA_AFLAGS := -Wa,-I$(obj) +$(obj)/piggy.o: $(src)/piggy.S $(obj)/vmlinux.bin.gz ==================================================================--- /dev/null +++ b/arch/x86_64/boot/compressed/piggy.S @@ -0,0 +1,10 @@ +.section .data.compressed,"a",@progbits + +.globl input_data, input_len, output_len + +input_len: .long input_data_end - input_data + +input_data: +.incbin "vmlinux.bin.gz" +output_len = .-4 +input_data_end: ==================================================================--- a/arch/x86_64/boot/compressed/vmlinux.lds +++ b/arch/x86_64/boot/compressed/vmlinux.lds @@ -1,4 +1,4 @@ OUTPUT_FORMAT("elf64-x86-64", "elf64-x86 -OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", "elf64-x86-64") +OUTPUT_FORMAT("elf64-x86-64") OUTPUT_ARCH(i386:x86-64) ENTRY(startup_64) SECTIONS ==================================================================--- a/arch/x86_64/boot/compressed/vmlinux.scr +++ /dev/null @@ -1,10 +0,0 @@ -SECTIONS -{ - .text.compressed : { - input_len = .; - LONG(input_data_end - input_data) input_data = .; - *(.data) - output_len = . - 4; - input_data_end = .; - } -} --
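
For readers unfamiliar with the .incbin approach the patch relies on, here is a
minimal standalone sketch of the technique, independent of the kernel build.
The file and symbol names (wrap.S, wrap.ld, blob.bin, payload_*) are made up
purely for illustration; the patch's real equivalents are payload.S, piggy.S
and setup.ld above.

	/* wrap.S -- pull an arbitrary binary file into a named, allocated
	 * section and bracket it with symbols, roughly what payload.S and
	 * piggy.S do in the patch. */
	.section .payload, "a", @progbits
	.globl payload_start, payload_end
payload_start:
	.incbin "blob.bin"		/* assembler copies the file in verbatim */
payload_end:

/* wrap.ld -- let the linker compute the sizes that tools/build used to
 * patch into the image by hand. */
SECTIONS
{
	.payload : {
		*(.payload)
	}
	payload_size  = payload_end - payload_start;
	payload_paras = (payload_size + 15) / 16;	/* 16-byte paragraphs */
}

Building it with something like "as -o wrap.o wrap.S" followed by
"ld -T wrap.ld -o wrapped wrap.o" (with blob.bin visible to the assembler; ld
will warn about a missing entry point, which is harmless here) yields an
object in which payload_size and payload_paras are ordinary symbols, so no
tool has to poke byte offsets into the image afterwards - which is the point
of dropping tools/build.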