The following patchset is dump-core take 3.
It changes the dump format to ELF, adds a PFN-GMFN table, adds HVM
support, and includes preliminary IA64 changes.

The take 2 patch was objected to because it added a new hypercall.
I have now removed it, so I hope that take 3 is more acceptable than
take 2. The page dumping logic is slightly complicated because the
file is treated as append-only; if seeking back and forth were
allowed, it could be simplified a bit.

Changes from take 2:
- removed the memory-map-related hypercall changes.
- removed xc_domain_translate_gpfn(); instead, trying to map pages is used.
- hvm builder, restore: set shared_info.arch.max_pfn.
- dropped the experimental IA64 support.

[TODO]
- IA64 support
  extend arch_shared_info for dump-core to know the dump area.
  update the ia64 domain builder to set it up appropriately.

Subject: [PATCH 1/2] dump-core take 3: hvm domain: set shared_info.arch.max_pfn
Subject: [PATCH 2/2] dump-core take 3: elf formatify and added PFN-GMFN table

-- 
yamahata

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
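To make the new table concrete: the dump now carries an array of (PFN, GMFN) pairs, so a reader can recover which guest machine frame backs each pseudo-physical frame. Below is a minimal consumer-side sketch; the struct mirrors the pair layout the patch defines in tools/libxc/xc_core.h, but `lookup_gmfn` is a hypothetical helper, not part of the patch:

```c
#include <stdint.h>
#include <stddef.h>

typedef uint64_t xen_pfn_t;   /* stand-in for the libxc type */

/* Mirrors the (pfn, gmfn) pair the patch stores in its p2m table
 * (struct p2m in tools/libxc/xc_core.h). */
struct p2m {
    xen_pfn_t pfn;    /* guest pseudo-physical frame number */
    xen_pfn_t gmfn;   /* guest machine frame backing that pfn */
};

/* Hypothetical reader helper: return the gmfn recorded for a pfn,
 * or ~0 if the pfn is absent from the table (no backing frame). */
static xen_pfn_t lookup_gmfn(const struct p2m *tbl, size_t n, xen_pfn_t pfn)
{
    size_t i;
    for ( i = 0; i < n; i++ )
        if ( tbl[i].pfn == pfn )
            return tbl[i].gmfn;
    return (xen_pfn_t)~0ULL;
}
```

PFNs with no MFN simply do not appear in the table, which is why the lookup, not array indexing, is the right access pattern.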
Isaku Yamahata
2007-Jan-21 07:46 UTC
[Xen-devel] [PATCH 1/2] dump-core take 3: hvm domain: set shared_info.arch.max_pfn
# HG changeset patch
# User yamahata@valinux.co.jp
# Date 1169363541 -32400
# Node ID e26aa113e059b1c824c43a1f8abf8e493a5696c4
# Parent  7e28a8c150edae62aa1a7db4411eb6efbb96af7e
x86 hvm domain builder, restore: set shared_info.arch.max_pfn
for dump-core to know the area to dump

PATCHNAME: x86_hvm_domain_builder_set_max_pfn

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>

diff -r 7e28a8c150ed -r e26aa113e059 tools/libxc/xc_hvm_build.c
--- a/tools/libxc/xc_hvm_build.c        Sat Jan 20 14:33:43 2007 +0000
+++ b/tools/libxc/xc_hvm_build.c        Sun Jan 21 16:12:21 2007 +0900
@@ -236,6 +236,7 @@ static int setup_guest(int xc_handle,
     /* NB. evtchn_upcall_mask is unused: leave as zero. */
     memset(&shared_info->evtchn_mask[0], 0xff,
            sizeof(shared_info->evtchn_mask));
+    shared_info->arch.max_pfn = page_array[nr_pages - 1];

     munmap(shared_info, PAGE_SIZE);
     if ( v_end > HVM_BELOW_4G_RAM_END )
diff -r 7e28a8c150ed -r e26aa113e059 tools/libxc/xc_hvm_restore.c
--- a/tools/libxc/xc_hvm_restore.c      Sat Jan 20 14:33:43 2007 +0000
+++ b/tools/libxc/xc_hvm_restore.c      Sun Jan 21 16:12:21 2007 +0900
@@ -31,6 +31,8 @@
 #include <xen/hvm/ioreq.h>
 #include <xen/hvm/params.h>
 #include <xen/hvm/e820.h>
+
+#define SCRATCH_PFN 0xFFFFF

 /* max mfn of the whole machine */
 static unsigned long max_mfn;
@@ -90,6 +92,8 @@ int xc_hvm_restore(int xc_handle, int io
     hvm_domain_context_t hvm_ctxt;
     unsigned long long v_end, memsize;
     unsigned long shared_page_nr;
+    struct xen_add_to_physmap xatp;
+    shared_info_t *shared_info = NULL;

     unsigned long mfn, pfn;
     unsigned int prev_pc, this_pc;
@@ -152,6 +156,20 @@ int xc_hvm_restore(int xc_handle, int io
         p2m[i] = i;
     for ( i = HVM_BELOW_4G_RAM_END >> PAGE_SHIFT; i < max_pfn; i++ )
         p2m[i] += HVM_BELOW_4G_MMIO_LENGTH >> PAGE_SHIFT;
+
+    /* shared-info page. shared_info.arch.max_pfn is used by dump-core */
+    xatp.domid = dom;
+    xatp.space = XENMAPSPACE_shared_info;
+    xatp.idx   = 0;
+    xatp.gpfn  = SCRATCH_PFN;
+    if ( (xc_memory_op(xc_handle, XENMEM_add_to_physmap, &xatp) != 0) ||
+         ((shared_info = xc_map_foreign_range(
+               xc_handle, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
+               SCRATCH_PFN)) == NULL) )
+        goto out;
+    memset(shared_info, 0, PAGE_SIZE);
+    shared_info->arch.max_pfn = p2m[max_pfn - 1];
+    munmap(shared_info, PAGE_SIZE);

     /* Allocate memory for HVM guest, skipping VGA hole 0xA0000-0xC0000. */
     rc = xc_domain_memory_populate_physmap(

-- 
yamahata
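A note on why the restore path stores p2m[max_pfn - 1] rather than max_pfn - 1 itself: frames at or above HVM_BELOW_4G_RAM_END are relocated past the MMIO hole, so the highest guest frame number is larger than the page count. The following standalone sketch mirrors that loop; the boundary and hole sizes are passed as parameters here instead of using the real constants from xen/include/public/hvm/e820.h:

```c
#include <stdint.h>

/* Mirror of the p2m set-up in xc_hvm_restore.c: identity map, then
 * relocate everything at or above the below-4G RAM boundary past the
 * MMIO hole.  ram_end_pfn stands in for HVM_BELOW_4G_RAM_END >> PAGE_SHIFT
 * and hole_pages for HVM_BELOW_4G_MMIO_LENGTH >> PAGE_SHIFT. */
static void build_hvm_p2m(uint64_t *p2m, unsigned long max_pfn,
                          unsigned long ram_end_pfn,
                          unsigned long hole_pages)
{
    unsigned long i;
    for ( i = 0; i < max_pfn; i++ )
        p2m[i] = i;
    for ( i = ram_end_pfn; i < max_pfn; i++ )
        p2m[i] += hole_pages;
}
```

With max_pfn just past the boundary, p2m[max_pfn - 1] lands beyond the hole, which is exactly the upper bound dump-core needs for the dump area.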
Isaku Yamahata
2007-Jan-21 07:47 UTC
[Xen-devel] [PATCH 2/2] dump-core take 3: elf formatify and added PFN-GMFN table
# HG changeset patch # User yamahata@valinux.co.jp # Date 1169363647 -32400 # Node ID f25c5d2687b014c7d3ce01debc5f49cdd7e38067 # Parent e26aa113e059b1c824c43a1f8abf8e493a5696c4 Use the guest''s own p2m table instead of xc_get_pfn_list(), which cannot handle PFNs with no MFN. Dump a zeroed page for PFNs with no MFN. Clearly deprecate xc_get_pfn_list(). Do not include a P2M table with HVM domains. Refuse to dump HVM until we can map its pages with PFNs. Signed-off-by: John Levon <john.levon@sun.com> ELF formatified. added PFN-GMFN table. HVM domain support. preliminary IA64 changes. [TODO] Xen/IA64 support. PATCHNAME: xm_dump_core_elf Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> diff -r e26aa113e059 -r f25c5d2687b0 tools/libxc/xc_core.c --- a/tools/libxc/xc_core.c Sun Jan 21 16:12:21 2007 +0900 +++ b/tools/libxc/xc_core.c Sun Jan 21 16:14:07 2007 +0900 @@ -1,10 +1,18 @@ +/* + * Elf format, (pfn, gmfn) table, IA64 support. + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. 
+ * + */ + #include "xg_private.h" +#include "xc_elf.h" +#include "xc_core.h" #include <stdlib.h> #include <unistd.h> /* number of pages to write at a time */ #define DUMP_INCREMENT (4 * 1024) -#define round_pgup(_p) (((_p)+(PAGE_SIZE-1))&PAGE_MASK) static int copy_from_domain_page(int xc_handle, @@ -21,107 +29,605 @@ copy_from_domain_page(int xc_handle, return 0; } +struct memory_map_entry { + uint64_t addr; + uint64_t size; +}; +typedef struct memory_map_entry memory_map_entry_t; + +#if defined(__i386__) || defined(__x86_64__) +#define ELF_ARCH_DATA ELFDATA2LSB +#if defined (__i386__) +# define ELF_ARCH_MACHINE EM_386 +#else +# define ELF_ARCH_MACHINE EM_X86_64 +#endif + +static int +map_p2m(int xc_handle, xc_dominfo_t *info, xen_pfn_t **live_p2m, + unsigned long *pfnp) +{ + /* Double and single indirect references to the live P2M table */ + xen_pfn_t *live_p2m_frame_list_list = NULL; + xen_pfn_t *live_p2m_frame_list = NULL; + shared_info_t *live_shinfo = NULL; + uint32_t dom = info->domid; + unsigned long max_pfn = 0; + int ret = -1; + int err; + + /* Map the shared info frame */ + live_shinfo = xc_map_foreign_range(xc_handle, dom, PAGE_SIZE, + PROT_READ, info->shared_info_frame); + + if ( !live_shinfo ) + { + PERROR("Couldn''t map live_shinfo"); + goto out; + } + + max_pfn = live_shinfo->arch.max_pfn; + + if ( max_pfn < info->nr_pages ) + { + ERROR("max_pfn < nr_pages -1 (%lx < %lx", max_pfn, info->nr_pages - 1); + goto out; + } + + live_p2m_frame_list_list + xc_map_foreign_range(xc_handle, dom, PAGE_SIZE, PROT_READ, + live_shinfo->arch.pfn_to_mfn_frame_list_list); + + if ( !live_p2m_frame_list_list ) + { + PERROR("Couldn''t map p2m_frame_list_list (errno %d)", errno); + goto out; + } + + live_p2m_frame_list + xc_map_foreign_batch(xc_handle, dom, PROT_READ, + live_p2m_frame_list_list, + P2M_FLL_ENTRIES); + + if ( !live_p2m_frame_list ) + { + PERROR("Couldn''t map p2m_frame_list"); + goto out; + } + + *live_p2m = xc_map_foreign_batch(xc_handle, dom, PROT_READ, + 
live_p2m_frame_list, + P2M_FL_ENTRIES); + + if ( !live_p2m ) + { + PERROR("Couldn''t map p2m table"); + goto out; + } + + *pfnp = max_pfn; + + + ret = 0; + +out: + err = errno; + + if ( live_shinfo ) + munmap(live_shinfo, PAGE_SIZE); + + if ( live_p2m_frame_list_list ) + munmap(live_p2m_frame_list_list, PAGE_SIZE); + + if ( live_p2m_frame_list ) + munmap(live_p2m_frame_list, P2M_FLL_ENTRIES * PAGE_SIZE); + + errno = err; + return ret; +} + +static int +memory_map_get(int xc_handle, xc_dominfo_t *info, + memory_map_entry_t **mapp, unsigned int *nr_entries) +{ + shared_info_t *live_shinfo = NULL; + uint32_t dom = info->domid; + unsigned long max_pfn = 0; + memory_map_entry_t *map = NULL; + + map = malloc(sizeof(*map)); + if ( !map ) + { + PERROR("Could not allocate memory"); + goto out; + } + + /* Map the shared info frame */ + live_shinfo = xc_map_foreign_range(xc_handle, dom, PAGE_SIZE, + PROT_READ, info->shared_info_frame); + if ( !live_shinfo ) + { + PERROR("Couldn''t map live_shinfo"); + goto out; + } + max_pfn = live_shinfo->arch.max_pfn; + munmap(live_shinfo, PAGE_SIZE); + + map->addr = 0; + map->size = max_pfn << PAGE_SHIFT; + + *mapp = map; + *nr_entries = 1; + return 0; + +out: + if ( map ) + free(map); + return -1; +} + +#elif defined (__ia64__) +#define ELF_ARCH_DATA ELFDATA2LSB +#define ELF_ARCH_MACHINE EM_IA64 + +static int +map_p2m(int xc_handle, xc_dominfo_t *info, xen_pfn_t **live_p2m, + unsigned long *pfnp) +{ + /* + * on ia64, both paravirtualize domain and hvm domain are + * auto_translated_physmap mode + */ + errno = ENOSYS; + return -1; +} + +#include "xc_efi.h" + +static int +memory_map_get(int xc_handle, xc_dominfo_t *info, + memory_map_entry_t **mapp, unsigned int *nr_entries) +{ + /* TODO XXX + * get efi memory descriptor via shared_info and then + * create memory_map_entry_t. 
+ */ + + errno = ENOSYS; + return -1; +} +#else +# error "unsupported architecture" +#endif + +#ifndef ELF_CORE_EFLAGS +#define ELF_CORE_EFLAGS 0 +#endif + +static int +get_phdr(Elf_Phdr **phdr, unsigned int *max_phdr, unsigned int *nr_phdr) +{ + Elf_Phdr *tmp; + + (*nr_phdr)++; + if ( *nr_phdr < *max_phdr ) + return 0; + +#define PHDR_INC 4096 + if ( *max_phdr < PHDR_INC ) + *max_phdr *= 2; + else + *max_phdr += PHDR_INC; + + tmp = realloc(*phdr, *max_phdr * sizeof(Elf_Phdr)); + if ( tmp == NULL ) + return -1; + *phdr = tmp; + return 0; +} + +static void +set_phdr(Elf_Phdr *phdr, unsigned long offset, uint64_t addr, uint64_t size) +{ + memset(phdr, 0, sizeof(*phdr)); + phdr->p_type = PT_LOAD; + phdr->p_flags = PF_X | PF_W | PF_R; + phdr->p_offset = offset; + phdr->p_vaddr = 0; + phdr->p_paddr = addr; + phdr->p_filesz = size; + phdr->p_memsz = size; + phdr->p_align = 0; +} + int xc_domain_dumpcore_via_callback(int xc_handle, uint32_t domid, void *args, dumpcore_rtn_t dump_rtn) { - unsigned long nr_pages; - xen_pfn_t *page_array = NULL; xc_dominfo_t info; - int i, nr_vcpus = 0; + int nr_vcpus = 0; char *dump_mem, *dump_mem_start = NULL; - struct xc_core_header header; vcpu_guest_context_t ctxt[MAX_VIRT_CPUS]; char dummy[PAGE_SIZE]; int dummy_len; - int sts; + int sts = -1; + + unsigned long i; + unsigned long j; + unsigned long nr_pages; + + memory_map_entry_t *memory_map = NULL; + unsigned int nr_memory_map; + unsigned int map_idx; + xen_pfn_t pfn; + + int auto_translated_physmap; + xen_pfn_t *p2m = NULL; + unsigned long max_pfn = 0; + struct p2m *p2m_array = NULL; + + unsigned long nr_pfn_array = 0; + xen_pfn_t *pfn_array = NULL; + + xen_pfn_t last_pfn; + uint64_t size; + + Elf_Ehdr ehdr; + unsigned long filesz; + unsigned long offset; + unsigned long fixup; +#define INIT_PHDR 32 + unsigned int max_phdr; + unsigned int nr_phdr; + Elf_Phdr *phdr; + struct xen_note note; + struct xen_core_header_desc core_header; if ( (dump_mem_start = 
malloc(DUMP_INCREMENT*PAGE_SIZE)) == NULL ) { PERROR("Could not allocate dump_mem"); - goto error_out; + goto out; } if ( xc_domain_getinfo(xc_handle, domid, 1, &info) != 1 ) { PERROR("Could not get info for domain"); - goto error_out; - } + goto out; + } + +#if defined(__i386__) || defined(__x86_64__) + auto_translated_physmap = 0; + if ( info.hvm ) + auto_translated_physmap = 1; +#elif defined (__ia64__) + auto_translated_physmap = 1; +#else +# error "unsupported archtecture" +#endif if ( domid != info.domid ) { PERROR("Domain %d does not exist", domid); - goto error_out; + goto out; } for ( i = 0; i <= info.max_vcpu_id; i++ ) - if ( xc_vcpu_getcontext(xc_handle, domid, i, &ctxt[nr_vcpus]) == 0) + if ( xc_vcpu_getcontext(xc_handle, domid, i, &ctxt[nr_vcpus]) == 0 ) nr_vcpus++; - + if ( nr_vcpus == 0 ) + { + PERROR("No VCPU context could be grabbed"); + goto out; + } + + /* obtain memory map */ + sts = memory_map_get(xc_handle, &info, &memory_map, &nr_memory_map); + if ( sts != 0 ) + goto out; + nr_pages = info.nr_pages; - - header.xch_magic = info.hvm ? 
XC_CORE_MAGIC_HVM : XC_CORE_MAGIC; - header.xch_nr_vcpus = nr_vcpus; - header.xch_nr_pages = nr_pages; - header.xch_ctxt_offset = sizeof(struct xc_core_header); - header.xch_index_offset = sizeof(struct xc_core_header) + - sizeof(vcpu_guest_context_t)*nr_vcpus; - dummy_len = (sizeof(struct xc_core_header) + - (sizeof(vcpu_guest_context_t) * nr_vcpus) + - (nr_pages * sizeof(xen_pfn_t))); - header.xch_pages_offset = round_pgup(dummy_len); - - sts = dump_rtn(args, (char *)&header, sizeof(struct xc_core_header)); - if ( sts != 0 ) - goto error_out; - + if ( !auto_translated_physmap ) + { + /* obtain p2m table */ + p2m_array = malloc(nr_pages * sizeof(struct p2m)); + if ( p2m_array == NULL ) + { + PERROR("Could not allocate p2m array"); + goto out; + } + + sts = map_p2m(xc_handle, &info, &p2m, &max_pfn); + if ( sts != 0 ) + goto out; + } + else + { + unsigned long total_pages = 0; + unsigned long pages; + + max_pfn = 0; + for ( map_idx = 0; map_idx < nr_memory_map; map_idx++ ) + { + + pages = memory_map[map_idx].size >> PAGE_SHIFT; + pfn = (memory_map[map_idx].addr >> PAGE_SHIFT) + pages; + if ( max_pfn < pfn ) + max_pfn = pfn; + total_pages += pages; + } + + pfn_array = malloc(total_pages * sizeof(pfn_array[0])); + if ( pfn_array == NULL ) + { + PERROR("Could not allocate pfn array"); + goto out; + } + nr_pfn_array = total_pages; + + total_pages = 0; + for ( map_idx = 0; map_idx < nr_memory_map; map_idx++ ) + { + pages = memory_map[map_idx].size >> PAGE_SHIFT; + pfn = memory_map[map_idx].addr >> PAGE_SHIFT; + for ( i = 0; i < pages; i++ ) + pfn_array[total_pages + i] = pfn + i; + total_pages += pages; + } + } + + memset(&ehdr, 0, sizeof(ehdr)); + ehdr.e_ident[EI_MAG0] = ELFMAG0; + ehdr.e_ident[EI_MAG1] = ELFMAG1; + ehdr.e_ident[EI_MAG2] = ELFMAG2; + ehdr.e_ident[EI_MAG3] = ELFMAG3; + ehdr.e_ident[EI_CLASS] = ELFCLASS; + ehdr.e_ident[EI_DATA] = ELF_ARCH_DATA; + ehdr.e_ident[EI_VERSION] = EV_CURRENT; + ehdr.e_ident[EI_OSABI] = ELFOSABI_SYSV; + ehdr.e_ident[EI_ABIVERSION] 
= EV_CURRENT; + + ehdr.e_type = ET_CORE; + ehdr.e_machine = ELF_ARCH_MACHINE; + ehdr.e_version = EV_CURRENT; + ehdr.e_entry = 0; + ehdr.e_phoff = sizeof(ehdr); + ehdr.e_shoff = 0; + ehdr.e_flags = ELF_CORE_EFLAGS; + ehdr.e_ehsize = sizeof(ehdr); + ehdr.e_phentsize = sizeof(Elf_Phdr); + /* ehdr.e_phum isn''t know here yet. fill it later */ + ehdr.e_shentsize = 0; + ehdr.e_shnum = 0; + ehdr.e_shstrndx = 0; + + /* create program header */ + nr_phdr = 0; + max_phdr = INIT_PHDR; + phdr = malloc(max_phdr * sizeof(phdr[0])); + if ( phdr == NULL ) + { + PERROR("Could not allocate memory"); + goto out; + } + /* here the number of program header is unknown. fix up offset later. */ + offset = sizeof(ehdr); + + /* note section */ + filesz = sizeof(struct xen_core_header) + /* core header */ + sizeof(struct xen_note) + sizeof(ctxt[0]) * nr_vcpus; /* vcpu context */ + if ( !auto_translated_physmap ) + filesz += sizeof(struct xen_note_p2m) + sizeof(p2m_array[0]) * nr_pages; /* p2m table */ + + + memset(&phdr[nr_phdr], 0, sizeof(phdr[0])); + phdr[nr_phdr].p_type = PT_NOTE; + phdr[nr_phdr].p_flags = 0; + phdr[nr_phdr].p_offset = offset; + phdr[nr_phdr].p_vaddr = 0; + phdr[nr_phdr].p_paddr = 0; + phdr[nr_phdr].p_filesz = filesz; + phdr[nr_phdr].p_memsz = filesz; + phdr[nr_phdr].p_align = 0; + + offset += filesz; + +#define INVALID_PFN (~0UL) +#define SET_PHDR_IF_NECESSARY \ + do { \ + if ( last_pfn != INVALID_PFN && size > 0 ) \ + { \ + sts = get_phdr(&phdr, &max_phdr, &nr_phdr); \ + if ( sts ) \ + goto out; \ + set_phdr(&phdr[nr_phdr], offset, \ + last_pfn << PAGE_SHIFT, size); \ + offset += size; \ + } \ + \ + last_pfn = INVALID_PFN; \ + size = 0; \ + } while (0) + + if ( !auto_translated_physmap ) + { + last_pfn = INVALID_PFN; + size = 0; + + j = 0; + for ( i = 0; i < max_pfn && j < nr_pages; i++ ) + { + if ( last_pfn + (size >> PAGE_SHIFT) != i ) + SET_PHDR_IF_NECESSARY; + + if ( p2m[i] == INVALID_P2M_ENTRY ) + continue; + + if ( last_pfn == INVALID_PFN ) + last_pfn = i; + size 
+= PAGE_SIZE; + + p2m_array[j].pfn = i; + p2m_array[j].gmfn = p2m[i]; + j++; + } + SET_PHDR_IF_NECESSARY; + + if ( j + 1 != nr_pages ) + PERROR("j + 1(%ld) != nr_pages (%ld)", j + 1, nr_pages); + } + else + { + unsigned long total_pages = 0; + j = 0; + for ( map_idx = 0; map_idx < nr_memory_map; map_idx++ ) + { + unsigned long pages; + + pages = memory_map[map_idx].size >> PAGE_SHIFT; + pfn = memory_map[map_idx].addr >> PAGE_SHIFT; + last_pfn = INVALID_PFN; + size = 0; + + for ( i = 0; i < pages; i++ ) + { + void *vaddr; + if ( last_pfn + (size >> PAGE_SHIFT) != pfn + i ) + SET_PHDR_IF_NECESSARY; + + /* try to map page to determin wheter it has underlying page */ + vaddr = xc_map_foreign_range(xc_handle, domid, + PAGE_SIZE, PROT_READ, + pfn_array[total_pages + i]); + if ( vaddr == NULL ) + continue; + munmap(vaddr, PAGE_SIZE); + + if ( last_pfn == INVALID_PFN ) + last_pfn = pfn + i; + size += PAGE_SIZE; + + pfn_array[j] = pfn + i; + j++; + } + SET_PHDR_IF_NECESSARY; + + total_pages += pages; + } + if ( j + 1 != nr_pages ) + PERROR("j + 1(%ld) != nr_pages (%ld)", j + 1, nr_pages); + } + + nr_phdr++; + + /* write out elf header */ + ehdr.e_phnum = nr_phdr; + sts = dump_rtn(args, (char*)&ehdr, sizeof(ehdr)); + if ( sts != 0 ) + goto out; + + fixup = nr_phdr * sizeof(phdr[0]); + /* fix up offset for note section */ + phdr[0].p_offset += fixup; + + dummy_len = ROUNDUP(offset + fixup, PAGE_SHIFT) - (offset + fixup); /* padding length */ + fixup += dummy_len; + /* fix up offset for pages */ + for ( i = 1; i < nr_phdr; i++ ) + phdr[i].p_offset += fixup; + /* write out program header */ + sts = dump_rtn(args, (char*)phdr, nr_phdr * sizeof(phdr[0])); + if ( sts != 0 ) + goto out; + + /* note section */ + memset(¬e, 0, sizeof(note)); + note.namesz = strlen(XEN_NOTES) + 1; + strncpy(note.name, XEN_NOTES, sizeof(note.name)); + + /* note section:xen core header */ + note.descsz = sizeof(core_header); + note.type = NT_XEN_HEADER; + core_header.xch_magic = info.hvm ? 
XC_CORE_MAGIC_HVM : XC_CORE_MAGIC; + core_header.xch_nr_vcpus = nr_vcpus; + core_header.xch_nr_pages = nr_pages; + core_header.xch_page_size = PAGE_SIZE; + sts = dump_rtn(args, (char*)¬e, sizeof(note)); + if ( sts != 0 ) + goto out; + sts = dump_rtn(args, (char*)&core_header, sizeof(core_header)); + if ( sts != 0 ) + goto out; + + /* note section:xen vcpu prstatus */ + note.descsz = sizeof(ctxt[0]) * nr_vcpus; + note.type = NT_XEN_PRSTATUS; + sts = dump_rtn(args, (char*)¬e, sizeof(note)); + if ( sts != 0 ) + goto out; sts = dump_rtn(args, (char *)&ctxt, sizeof(ctxt[0]) * nr_vcpus); if ( sts != 0 ) - goto error_out; - - if ( (page_array = malloc(nr_pages * sizeof(xen_pfn_t))) == NULL ) - { - IPRINTF("Could not allocate memory\n"); - goto error_out; - } - if ( xc_get_pfn_list(xc_handle, domid, page_array, nr_pages) != nr_pages ) - { - IPRINTF("Could not get the page frame list\n"); - goto error_out; - } - sts = dump_rtn(args, (char *)page_array, nr_pages * sizeof(xen_pfn_t)); - if ( sts != 0 ) - goto error_out; - + goto out; + + /* note section:create p2m table */ + if ( !auto_translated_physmap ) + { + note.descsz = sizeof(p2m_array[0]) * nr_pages; + note.type = NT_XEN_P2M; + sts = dump_rtn(args, (char*)¬e, sizeof(note)); + if ( sts != 0 ) + goto out; + sts = dump_rtn(args, (char *)p2m_array, + sizeof(p2m_array[0]) * nr_pages); + if ( sts != 0 ) + goto out; + } + /* Pad the output data to page alignment. 
*/ memset(dummy, 0, PAGE_SIZE); - sts = dump_rtn(args, dummy, header.xch_pages_offset - dummy_len); - if ( sts != 0 ) - goto error_out; - + sts = dump_rtn(args, dummy, dummy_len); + if ( sts != 0 ) + goto out; + + /* dump pages */ for ( dump_mem = dump_mem_start, i = 0; i < nr_pages; i++ ) { - copy_from_domain_page(xc_handle, domid, page_array[i], dump_mem); + xen_pfn_t gmfn; + if ( !auto_translated_physmap ) + gmfn = p2m_array[i].gmfn; + else + gmfn = pfn_array[i]; + + copy_from_domain_page(xc_handle, domid, gmfn, dump_mem); dump_mem += PAGE_SIZE; if ( ((i + 1) % DUMP_INCREMENT == 0) || ((i + 1) == nr_pages) ) { - sts = dump_rtn(args, dump_mem_start, dump_mem - dump_mem_start); + sts = dump_rtn(args, dump_mem_start, + dump_mem - dump_mem_start); if ( sts != 0 ) - goto error_out; + goto out; dump_mem = dump_mem_start; - } - } - + } + } + + sts = 0; + +out: + if ( p2m ) + { + if ( info.hvm ) + free( p2m ); + else + munmap(p2m, P2M_SIZE); + } free(dump_mem_start); - free(page_array); - return 0; - - error_out: - free(dump_mem_start); - free(page_array); - return -1; + if ( p2m_array != NULL ) + free(p2m_array); + if ( pfn_array != NULL ) + free(pfn_array); + free(phdr); + return sts; } /* Callback args for writing to a local dump file. */ diff -r e26aa113e059 -r f25c5d2687b0 tools/libxc/xc_core.h --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/libxc/xc_core.h Sun Jan 21 16:14:07 2007 +0900 @@ -0,0 +1,81 @@ +/* + * Copyright (c) 2007 Isaku Yamahata <yamahata at valinux co jp> + * VA Linux Systems Japan K.K. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + * + */ + +#ifndef XC_CORE_H +#define XC_CORE_H + +#define XEN_NOTES "XEN CORE" + +/* Notes used in xen core*/ +#define NT_XEN_NOTEBASE 256 /* large enough which isn''t used by others */ +#define NT_XEN_HEADER (NT_XEN_NOTEBASE + 0) +#define NT_XEN_PRSTATUS (NT_XEN_NOTEBASE + 1) +#define NT_XEN_P2M (NT_XEN_NOTEBASE + 2) + + +struct xen_note { + uint32_t namesz; + uint32_t descsz; + uint32_t type; + char name[12]; /* to hold XEN_NOTES and 64bit aligned. + * 8 <= sizeof(XEN_NOTES) < 12 + */ +}; + + +struct xen_core_header_desc { + uint64_t xch_magic; + uint64_t xch_nr_vcpus; + uint64_t xch_nr_pages; + uint64_t xch_page_size; +}; + +struct p2m { + xen_pfn_t pfn; + xen_pfn_t gmfn; +}; + + +struct xen_core_header { + struct xen_note note; + struct xen_core_header_desc core_header; +}; + +struct xen_note_prstatus { + struct xen_note note; + vcpu_guest_context_t ctxt[0]; +}; + +struct xen_note_p2m { + struct xen_note note; + struct p2m p2m[0]; +}; + +#endif /* XC_CORE_H */ + +/* + * Local variables: + * mode: C + * c-set-style: "BSD" + * c-basic-offset: 4 + * tab-width: 4 + * indent-tabs-mode: nil + * End: + */ diff -r e26aa113e059 -r f25c5d2687b0 tools/libxc/xc_efi.h --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/libxc/xc_efi.h Sun Jan 21 16:14:07 2007 +0900 @@ -0,0 +1,68 @@ +#ifndef XC_EFI_H +#define XC_EFI_H + +/* definitions from xen/include/asm-ia64/linux-xen/linux/efi.h */ + +/* + * Extensible Firmware Interface + * Based on ''Extensible Firmware Interface Specification'' version 0.9, April 30, 1999 + * + * Copyright (C) 1999 VA Linux Systems + * Copyright (C) 1999 Walt Drummond <drummond@valinux.com> + * Copyright (C) 1999, 2002-2003 Hewlett-Packard Co. 
+ * David Mosberger-Tang <davidm@hpl.hp.com> + * Stephane Eranian <eranian@hpl.hp.com> + */ + +/* + * Memory map descriptor: + */ + +/* Memory types: */ +#define EFI_RESERVED_TYPE 0 +#define EFI_LOADER_CODE 1 +#define EFI_LOADER_DATA 2 +#define EFI_BOOT_SERVICES_CODE 3 +#define EFI_BOOT_SERVICES_DATA 4 +#define EFI_RUNTIME_SERVICES_CODE 5 +#define EFI_RUNTIME_SERVICES_DATA 6 +#define EFI_CONVENTIONAL_MEMORY 7 +#define EFI_UNUSABLE_MEMORY 8 +#define EFI_ACPI_RECLAIM_MEMORY 9 +#define EFI_ACPI_MEMORY_NVS 10 +#define EFI_MEMORY_MAPPED_IO 11 +#define EFI_MEMORY_MAPPED_IO_PORT_SPACE 12 +#define EFI_PAL_CODE 13 +#define EFI_MAX_MEMORY_TYPE 14 + +/* Attribute values: */ +#define EFI_MEMORY_UC ((u64)0x0000000000000001ULL) /* uncached */ +#define EFI_MEMORY_WC ((u64)0x0000000000000002ULL) /* write-coalescing */ +#define EFI_MEMORY_WT ((u64)0x0000000000000004ULL) /* write-through */ +#define EFI_MEMORY_WB ((u64)0x0000000000000008ULL) /* write-back */ +#define EFI_MEMORY_WP ((u64)0x0000000000001000ULL) /* write-protect */ +#define EFI_MEMORY_RP ((u64)0x0000000000002000ULL) /* read-protect */ +#define EFI_MEMORY_XP ((u64)0x0000000000004000ULL) /* execute-protect */ +#define EFI_MEMORY_RUNTIME ((u64)0x8000000000000000ULL) /* range requires runtime mapping */ +#define EFI_MEMORY_DESCRIPTOR_VERSION 1 + +#define EFI_PAGE_SHIFT 12 + +/* + * For current x86 implementations of EFI, there is + * additional padding in the mem descriptors. This is not + * the case in ia64. Need to have this fixed in the f/w. 
+ */ +typedef struct { + u32 type; + u32 pad; + u64 phys_addr; + u64 virt_addr; + u64 num_pages; + u64 attribute; +#if defined (__i386__) + u64 pad1; +#endif +} efi_memory_desc_t; + +#endif /* XC_EFI_H */ diff -r e26aa113e059 -r f25c5d2687b0 tools/libxc/xenctrl.h --- a/tools/libxc/xenctrl.h Sun Jan 21 16:12:21 2007 +0900 +++ b/tools/libxc/xenctrl.h Sun Jan 21 16:14:07 2007 +0900 @@ -552,6 +552,10 @@ unsigned long xc_translate_foreign_addre unsigned long xc_translate_foreign_address(int xc_handle, uint32_t dom, int vcpu, unsigned long long virt); +/** + * DEPRECATED. Avoid using this, as it does not correctly account for PFNs + * without a backing MFN. + */ int xc_get_pfn_list(int xc_handle, uint32_t domid, xen_pfn_t *pfn_buf, unsigned long max_pfns); diff -r e26aa113e059 -r f25c5d2687b0 tools/libxc/xg_private.h --- a/tools/libxc/xg_private.h Sun Jan 21 16:12:21 2007 +0900 +++ b/tools/libxc/xg_private.h Sun Jan 21 16:14:07 2007 +0900 @@ -119,6 +119,25 @@ typedef unsigned long l4_pgentry_t; (((_a) >> L4_PAGETABLE_SHIFT) & (L4_PAGETABLE_ENTRIES - 1)) #endif +#define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1)) + +/* Size in bytes of the P2M (rounded up to the nearest PAGE_SIZE bytes) */ +#define P2M_SIZE ROUNDUP((max_pfn * sizeof(xen_pfn_t)), PAGE_SHIFT) + +/* Number of xen_pfn_t in a page */ +#define fpp (PAGE_SIZE/sizeof(xen_pfn_t)) + +/* Number of entries in the pfn_to_mfn_frame_list_list */ +#define P2M_FLL_ENTRIES (((max_pfn)+(fpp*fpp)-1)/(fpp*fpp)) + +/* Number of entries in the pfn_to_mfn_frame_list */ +#define P2M_FL_ENTRIES (((max_pfn)+fpp-1)/fpp) + +/* Size in bytes of the pfn_to_mfn_frame_list */ +#define P2M_FL_SIZE ((P2M_FL_ENTRIES)*sizeof(unsigned long)) + +#define INVALID_P2M_ENTRY (~0UL) + struct domain_setup_info { uint64_t v_start; diff -r e26aa113e059 -r f25c5d2687b0 tools/libxc/xg_save_restore.h --- a/tools/libxc/xg_save_restore.h Sun Jan 21 16:12:21 2007 +0900 +++ b/tools/libxc/xg_save_restore.h Sun Jan 21 16:14:07 2007 
+0900 @@ -82,7 +82,6 @@ static int get_platform_info(int xc_hand */ #define PFN_TO_KB(_pfn) ((_pfn) << (PAGE_SHIFT - 10)) -#define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1)) /* @@ -95,25 +94,5 @@ static int get_platform_info(int xc_hand #define M2P_SIZE(_m) ROUNDUP(((_m) * sizeof(xen_pfn_t)), M2P_SHIFT) #define M2P_CHUNKS(_m) (M2P_SIZE((_m)) >> M2P_SHIFT) -/* Size in bytes of the P2M (rounded up to the nearest PAGE_SIZE bytes) */ -#define P2M_SIZE ROUNDUP((max_pfn * sizeof(xen_pfn_t)), PAGE_SHIFT) - -/* Number of xen_pfn_t in a page */ -#define fpp (PAGE_SIZE/sizeof(xen_pfn_t)) - -/* Number of entries in the pfn_to_mfn_frame_list */ -#define P2M_FL_ENTRIES (((max_pfn)+fpp-1)/fpp) - -/* Size in bytes of the pfn_to_mfn_frame_list */ -#define P2M_FL_SIZE ((P2M_FL_ENTRIES)*sizeof(unsigned long)) - -/* Number of entries in the pfn_to_mfn_frame_list_list */ -#define P2M_FLL_ENTRIES (((max_pfn)+(fpp*fpp)-1)/(fpp*fpp)) - /* Returns TRUE if the PFN is currently mapped */ #define is_mapped(pfn_type) (!((pfn_type) & 0x80000000UL)) - -#define INVALID_P2M_ENTRY (~0UL) - - - -- yamahata _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
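The cover letter's remark about the file being append-only shows up in the fixup arithmetic in xc_core.c above: PT_LOAD offsets are first computed as if the program-header table occupied no space, then shifted by the table size plus page-alignment padding once nr_phdr is known. A small sketch of that arithmetic (`page_fixup` is a hypothetical name; ROUNDUP is the macro the patch consolidates into xg_private.h):

```c
/* The macro the patch moves into xg_private.h: round _x up to a
 * multiple of 2^_w. */
#define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1))

#define PAGE_SHIFT_SIM 12   /* assumed 4K pages for the example */

/* Total amount by which every page-data p_offset must be shifted:
 * the program-header table itself plus the padding that page-aligns
 * the page data, mirroring the fixup step in the patch. */
static unsigned long page_fixup(unsigned long offset, unsigned int nr_phdr,
                                unsigned long phentsize)
{
    unsigned long fixup = nr_phdr * phentsize;
    unsigned long pad = ROUNDUP(offset + fixup, PAGE_SHIFT_SIM)
                        - (offset + fixup);
    return fixup + pad;
}
```

If seeking back were allowed, the writer could instead patch e_phnum and the offsets in place after the fact, which is the simplification the cover letter alludes to.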
On 21/1/07 7:46 am, "Isaku Yamahata" <yamahata@valinux.co.jp> wrote:

> The following patchset is the dump-core take 3.
> It changes its format into ELF, adds PFN-GMFN table, HVM support,
> and adds IA64 preliminary changes.

These patches look much better.

Can you give some description of your Elf format? If we plan to use Elf for save/restore as well, it would be nice to pick a format that is generalisable to both cases. The use of program headers seems weird (since there is no sensible 'virtual address' to specify, as the Xen core dump format is not defined in the context of a simple single address space) and is going to be no use for live migration, where we need to be able to specify GPFN-GMFN relationships on the fly, presumably in a custom section format.

Are there any tools to parse this new dump format, or will we have to wait for the crash utility and xc_ptrace_core() to catch up?

 -- Keir
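On the parsing question: the patch's note layout stays compatible with a generic ELF note walker, because its fixed 12-byte name field keeps each descriptor 4-byte aligned, exactly as Elf_Nhdr-style parsing expects. A hedged sketch of such a walker (`struct nhdr` and `find_note` are illustrative names, not existing tooling):

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Generic ELF note record header (Elf_Nhdr layout); the name and the
 * descriptor follow it, each padded to 4-byte alignment. */
struct nhdr { uint32_t namesz, descsz, type; };

#define ALIGN4(x) (((x) + 3u) & ~3u)

/* Walk a PT_NOTE segment and return the offset of the descriptor of
 * the first note matching (name, type), or (size_t)-1 if absent. */
static size_t find_note(const unsigned char *buf, size_t len,
                        const char *name, uint32_t type)
{
    size_t off = 0;
    while ( off + sizeof(struct nhdr) <= len )
    {
        struct nhdr n;
        memcpy(&n, buf + off, sizeof(n));
        size_t name_off = off + sizeof(n);
        size_t desc_off = name_off + ALIGN4(n.namesz);
        if ( desc_off + n.descsz > len )
            break;
        if ( n.namesz == strlen(name) + 1 &&
             memcmp(buf + name_off, name, n.namesz) == 0 &&
             n.type == type )
            return desc_off;
        off = desc_off + ALIGN4(n.descsz);
    }
    return (size_t)-1;
}
```

For the "XEN CORE" name, namesz is 9 and ALIGN4(9) is 12, which is why the patch sizes the name field at 12 bytes.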
On Sun, Jan 21, 2007 at 10:05:42AM +0000, Keir Fraser wrote:

> Can you give some description of your Elf format? If we plan to use Elf
> for save/restore as well, it would be nice to pick a format that is
> generalisable to both cases.

I'll document it.

> The use of program headers seems weird (since there is no sensible
> 'virtual address' to specify, as the Xen core dump format is not defined
> in the context of a simple single address space) and is going to be no
> use for live migration where we need to be able to specify GPFN-GMFN
> relationships on the fly, presumably in a custom section format.

Hmm. It seems it is time to change my mind. So John was right.
I'll change the format to use sections instead of program headers.

Before coding, I'd like to clarify the sections.
(If you have more preferable names, please suggest them;
I don't stick to the following names.)
- .Xen.core_header
- .Xen.vcpu_context
  Or should an ELF note section be used for the core header and vcpu context?
- .Xen.p2m for non-auto-translated physmap mode
- .Xen.pfn for auto-translated physmap mode
  Or should .Xen.p2m with PFN=GMFN be used?
- .Xen.pages

> Are there any tools to parse this new dump format, or will we have to
> wait for the crash utility and xc_ptrace_core() to catch up?

No. I'll work on xc_ptrace_core() unless someone else does.
Probably it would be after ia64 support.

-- 
yamahata
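In the section-based variant, a reader would locate data by name through the section-header string table rather than by note type. A minimal sketch (struct shdr keeps only the fields a name lookup touches; `find_section` is a hypothetical helper using Isaku's proposed .Xen.* names):

```c
#include <stdint.h>
#include <string.h>

/* Cut-down stand-in for Elf64_Shdr: just the fields the lookup needs. */
struct shdr {
    uint32_t sh_name;     /* offset into the section-name string table */
    uint64_t sh_offset;   /* file offset of the section contents */
    uint64_t sh_size;
};

/* Return the index of the section called 'want', or -1 if absent:
 * how a dump reader would find ".Xen.p2m" or ".Xen.pages". */
static int find_section(const struct shdr *sh, int n,
                        const char *shstrtab, const char *want)
{
    int i;
    for ( i = 0; i < n; i++ )
        if ( strcmp(shstrtab + sh[i].sh_name, want) == 0 )
            return i;
    return -1;
}
```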
On 21/1/07 2:02 pm, "Isaku Yamahata" <yamahata@valinux.co.jp> wrote:

> Hmm. It seems the time to change my mind. So John was right.
> I'll change the format to use sections instead of program headers.
>
> Before coding, I'd like to clarify sections.
> (If you have more preferable names, please suggest.
> I don't stick to the following names.)
> - .Xen.core_header
> - .Xen.vcpu_context
>   Or elf note section should be used for the core header and vcpu context?
> - .Xen.p2m for non-auto translated physmode
> - .Xen.pfn for auto translated physmode
>   Or should Xen.p2m with PFN=GMFN be used?
> - .Xen.pages

This looks fine for core dump format. I think we should go with notes for everything except GPFN-GMFN info and page data. Even those could go in a note as well really (although I suppose it would seem a bit odd). If you pick your own name for core-dump elf notes, do you need to start the type numbers at 256? Seems to me you have your own brand new numbering space if you pick a new name.

I think there will have to be differences for save/restore/migrate. For live migration we want to stream the GPFN values (or GPFN-GMFN pairs) at the same time as the actual page data (or in small batches) -- probably we'll do this in a custom section format where we interleave batches of GPFN/GMFN followed by the associated page data. But this makes rather less sense for core dump format, where efficient random access to guest physical memory is desirable (and so the format you propose makes more sense).

 -- Keir
On Sun, Jan 21, 2007 at 03:28:37PM +0000, Keir Fraser wrote:

> > Before coding, I'd like to clarify sections.
> > (If you have more preferable names, please suggest.
> > I don't stick to the following names.)
> > - .Xen.core_header
> > - .Xen.vcpu_context
> >   Or elf note section should be used for the core header and vcpu context?
> > - .Xen.p2m for non-auto translated physmode
> > - .Xen.pfn for auto translated physmode
> >   Or should Xen.p2m with PFN=GMFN be used?
> > - .Xen.pages
>
> This looks fine for core dump format. I think we should go with notes for
> everything except GPFN-GMFN info and page data. Even those could go in a
> note as well really (although I suppose it would seem a bit odd).
> If you pick your own name for core-dump elf notes, do you need to start
> the type numbers at 256?

No. 256 doesn't have any special meaning; it is just a number which isn't used elsewhere. See below.

> Seems to me you have your own brand new numbering space if you pick a
> new name.

Yes, in theory. Unfortunately binutils and elfutils don't look at the name; they simply check the number and report a type. So I thought it was better to avoid the numbers which are commonly used.

-- 
yamahata
On Mon, 2007-01-22 at 01:23 +0900, Isaku Yamahata wrote:
> On Sun, Jan 21, 2007 at 03:28:37PM +0000, Keir Fraser wrote:
> > Seems to me you have your own brand new numbering space if you pick a
> > new name.
>
> Yes, in theory.
> Unfortunately binutils and elfutils don't look at the name;
> they simply check the number and report a type.

We've been treating the "Xen" namespace as our own and already have some notes which overlap other notes in other namespaces. If binutils or other tools have a problem with this then they should be fixed. Since this is just an issue when using readelf etc. to display the notes, I don't think it is critical enough to work around like this. Tools which actually manipulate the notes seem to do the right thing.

I think the definitions should be in xen/include/public/elfnote.h rather than starting a second list in tools/libxc/xc_core.h. I also think these notes should use the "Xen" namespace which is already used for the boot notes (0x0.......) and kexec support (0x1.......) rather than the "XEN CORE" namespace, possibly using 0x2........

There's a possibility that you might be able to extend XEN_ELFNOTE_CRASH_REGS/crash_xen_core_t to meet your needs rather than adding NT_XEN_PRSTATUS/xen_note_prstatus.

Ian.
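Following Ian's suggestion, the dump-core note types would become ordinary entries in xen/include/public/elfnote.h under the "Xen" name, carved out of an unused numeric range. A purely illustrative sketch; the names and the 0x2000000 base are assumptions here, not assigned values:

```c
/* Hypothetical numbering modelled on the existing elfnote.h ranges
 * (0x0....... boot notes, 0x1....... kexec/crash notes); the real
 * values would be assigned in xen/include/public/elfnote.h. */
#define XEN_ELFNOTE_DUMPCORE_BASE      0x2000000U
#define XEN_ELFNOTE_DUMPCORE_HEADER    (XEN_ELFNOTE_DUMPCORE_BASE + 0)
#define XEN_ELFNOTE_DUMPCORE_PRSTATUS  (XEN_ELFNOTE_DUMPCORE_BASE + 1)
#define XEN_ELFNOTE_DUMPCORE_P2M       (XEN_ELFNOTE_DUMPCORE_BASE + 2)

/* Cheap sanity check that a type stays inside the dump-core range and
 * clear of the low numbers binutils/elfutils special-case for cores. */
static int dumpcore_note_in_range(unsigned int type)
{
    return type >= XEN_ELFNOTE_DUMPCORE_BASE &&
           type <  XEN_ELFNOTE_DUMPCORE_BASE + 0x1000000U;
}
```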