This patch series implement changes to the kexec system in Xen, both along the lines of better supporting 32bit crash kernels on 64bit Xen, and providing more useful information to the crash kernel. Most of these patches have been submitted upstream in the past; They are largely unchanged in this new submission. The main new one is patch 4 which introduces the crashtables mechanism of passing information to the crashdump kernel using ELF CORE notes. The set of patches (ported to xen-4.1) have been tested against a wide range of hardware. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Currently, the buffers for crash notes are allocated per CPU when a KEXEC_CMD_kexec_get_range hypercall is made, referencing the CPU in question. This has certain problems including not being able to allocate the crash buffers if the host is out of memory when crashing. In addition, my forthcoming code to support 32bit kdump kernels on 64bit Xen on large (>64GB) boxes will require some guarentees as to where the crash note buffers are actually allocated in physical memory. This is far easier to sort out at boot time, rather than after dom0 has been booted and potentially using the physical memory required. Therefore, allocate the crash note buffers at boot time. Changes since v5: * Introduce sizeof_cpu_notes to move calculation of note size into a separate location. * Tweak sizeof_note() to return size_t rather than int, as it is a function based upon sizeof(). Changes since v4: * Replace the current cpu crash note scheme of using void pointers and hand calculating the size each time is needed, by a range structure containing a pointer and a size. This removes duplicate times where the size is calculated. * Tweak kexec_get_cpu(). Don''t fail if a cpu is offline because it may already have crash notes, and may be up by the time a crash happens. Split the error conditions up to return ERANGE for an out-of-range cpu request rather than EINVAL. Finally, returning a range of zeros is acceptable, so do this in preference to failing. Changes since v3: * Alter the spinlocks to avoid calling xmalloc/xfree while holding the lock. * Tidy up the coding style used. Changes since v2: * Allocate crash_notes dynamically using nr_cpu_ids at boot time, rather than statically using NR_CPUS. * Fix the incorrect use of signed integers for cpu id. * Fix collateral damage to do_crashdump_trigger() and crashdump_trigger_handler caused by reordering sizeof_note() and setup_note() * Tweak the issue about returing -ENOMEM from kexec_init_cpu_note(). No functional change. * Change kexec_get_cpu() to attempt to allocate crash note buffers in case we have more free memory now than when the pcpu came up. * Now that there are two codepaths possibly allocating crash notes, protect the allocation itself with a spinlock. Changes since v1: * Use cpu hotplug notifiers to handle allocating of the notes buffers rather than assuming the boot state of cpus will be the same as the crash state. * Move crash_notes from being per_cpu. This is because the kdump kernel elf binary put in the crash area is hard coded to physical addresses which the dom0 kernel typically obtains at boot time. If a cpu is offlined, its buffer should not be deallocated because the kdump kernel would read junk when trying to get the crash notes. Similarly, the same problem would occur if the cpu was re-onlined later and its crash notes buffer was allocated elsewhere. * Only attempt to allocate buffers if a crash area has been specified. Else, allocating crash note buffers is a waste of space. Along with this, change the test in kexec_get_cpu to return -EINVAL if no buffers have been allocated. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r f61120046915 -r 977fa1c61b3e xen/common/kexec.c --- a/xen/common/kexec.c +++ b/xen/common/kexec.c @@ -25,13 +25,19 @@ #include <xen/kexec.h> #include <public/elfnote.h> #include <xsm/xsm.h> +#include <xen/cpu.h> #ifdef CONFIG_COMPAT #include <compat/kexec.h> #endif bool_t kexecing = FALSE; -static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes); +/* Memory regions to store the per cpu register state etc. on a crash. */ +typedef struct { Elf_Note * start; size_t size; } crash_note_range_t; +static crash_note_range_t * crash_notes; + +/* Lock to prevent race conditions when allocating the crash note buffers. */ +static DEFINE_SPINLOCK(crash_notes_lock); static Elf_Note *xen_crash_note; @@ -165,13 +171,17 @@ static void one_cpu_only(void) void kexec_crash_save_cpu(void) { int cpu = smp_processor_id(); - Elf_Note *note = per_cpu(crash_notes, cpu); + Elf_Note *note; ELF_Prstatus *prstatus; crash_xen_core_t *xencore; + BUG_ON ( ! crash_notes ); + if ( cpumask_test_and_set_cpu(cpu, &crash_saved_cpus) ) return; + note = crash_notes[cpu].start; + prstatus = (ELF_Prstatus *)ELFNOTE_DESC(note); note = ELFNOTE_NEXT(note); @@ -257,13 +267,6 @@ static struct keyhandler crashdump_trigg .desc = "trigger a crashdump" }; -static __init int register_crashdump_trigger(void) -{ - register_keyhandler(''C'', &crashdump_trigger_keyhandler); - return 0; -} -__initcall(register_crashdump_trigger); - static void setup_note(Elf_Note *n, const char *name, int type, int descsz) { int l = strlen(name) + 1; @@ -273,13 +276,144 @@ static void setup_note(Elf_Note *n, cons n->type = type; } -static int sizeof_note(const char *name, int descsz) +static size_t sizeof_note(const char *name, int descsz) { return (sizeof(Elf_Note) + ELFNOTE_ALIGN(strlen(name)+1) + ELFNOTE_ALIGN(descsz)); } +static size_t sizeof_cpu_notes(const unsigned long cpu) +{ + /* All CPUs present a PRSTATUS and crash_xen_core note. */ + size_t bytes + + sizeof_note("CORE", sizeof(ELF_Prstatus)) + + + sizeof_note("Xen", sizeof(crash_xen_core_t)); + + /* CPU0 also presents the crash_xen_info note. */ + if ( ! cpu ) + bytes = bytes + + sizeof_note("Xen", sizeof(crash_xen_info_t)); + + return bytes; +} + +/* Allocate a crash note buffer for a newly onlined cpu. */ +static int kexec_init_cpu_notes(const unsigned long cpu) +{ + Elf_Note * note = NULL; + int ret = 0; + int nr_bytes = 0; + + BUG_ON( cpu >= nr_cpu_ids || ! crash_notes ); + + /* If already allocated, nothing to do. */ + if ( crash_notes[cpu].start ) + return ret; + + nr_bytes = sizeof_cpu_notes(cpu); + note = xmalloc_bytes(nr_bytes); + + /* Protect the write into crash_notes[] with a spinlock, as this function + * is on a hotplug path and a hypercall path. */ + spin_lock(&crash_notes_lock); + + /* If we are racing with another CPU and it has beaten us, give up + * gracefully. */ + if ( crash_notes[cpu].start ) + { + spin_unlock(&crash_notes_lock); + /* Always return ok, because whether we successfully allocated or not, + * another CPU has successfully allocated. */ + if ( note ) + xfree(note); + } + else + { + crash_notes[cpu].start = note; + crash_notes[cpu].size = nr_bytes; + spin_unlock(&crash_notes_lock); + + /* If the allocation failed, and another CPU did not beat us, give + * up with ENOMEM. */ + if ( ! note ) + ret = -ENOMEM; + /* else all is good so lets set up the notes. */ + else + { + /* Set up CORE note. */ + setup_note(note, "CORE", NT_PRSTATUS, sizeof(ELF_Prstatus)); + note = ELFNOTE_NEXT(note); + + /* Set up Xen CORE note. */ + setup_note(note, "Xen", XEN_ELFNOTE_CRASH_REGS, + sizeof(crash_xen_core_t)); + + if ( ! cpu ) + { + /* Set up Xen Crash Info note. */ + xen_crash_note = note = ELFNOTE_NEXT(note); + setup_note(note, "Xen", XEN_ELFNOTE_CRASH_INFO, + sizeof(crash_xen_info_t)); + } + } + } + + return ret; +} + +static int cpu_callback( + struct notifier_block *nfb, unsigned long action, void *hcpu) +{ + unsigned long cpu = (unsigned long)hcpu; + + /* Only hook on CPU_UP_PREPARE because once a crash_note has been reported + * to dom0, it must keep it around in case of a crash, as the crash kernel + * will be hard coded to the original physical address reported. */ + switch ( action ) + { + case CPU_UP_PREPARE: + /* Ignore return value. If this boot time, -ENOMEM will cause all + * manner of problems elsewhere very soon, and if it is during runtime, + * then failing to allocate crash notes is not a good enough reason to + * fail the CPU_UP_PREPARE */ + kexec_init_cpu_notes(cpu); + break; + default: + break; + } + return NOTIFY_DONE; +} + +static struct notifier_block cpu_nfb = { + .notifier_call = cpu_callback +}; + +static int __init kexec_init(void) +{ + void *cpu = (void *)(unsigned long)smp_processor_id(); + + /* If no crash area, no need to allocate space for notes. */ + if ( !kexec_crash_area.size ) + return 0; + + register_keyhandler(''C'', &crashdump_trigger_keyhandler); + + crash_notes = xmalloc_array(crash_note_range_t, nr_cpu_ids); + if ( ! crash_notes ) + return -ENOMEM; + + memset(crash_notes, 0, sizeof(crash_note_range_t) * nr_cpu_ids); + + cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu); + register_cpu_notifier(&cpu_nfb); + return 0; +} +/* The reason for this to be a presmp_initcall as opposed to a regular + * __initcall is to allow the setup of the cpu hotplug handler before APs are + * brought up. */ +presmp_initcall(kexec_init); + static int kexec_get_reserve(xen_kexec_range_t *range) { if ( kexec_crash_area.size > 0 && kexec_crash_area.start > 0) { @@ -294,44 +428,29 @@ static int kexec_get_reserve(xen_kexec_r static int kexec_get_cpu(xen_kexec_range_t *range) { int nr = range->nr; - int nr_bytes = 0; - if ( nr < 0 || nr >= nr_cpu_ids || !cpu_online(nr) ) + if ( nr < 0 || nr >= nr_cpu_ids ) + return -ERANGE; + + if ( ! crash_notes ) return -EINVAL; - nr_bytes += sizeof_note("CORE", sizeof(ELF_Prstatus)); - nr_bytes += sizeof_note("Xen", sizeof(crash_xen_core_t)); + /* Try once again to allocate room for the crash notes. It is just possible + * that more space has become available since we last tried. If space has + * already been allocated, kexec_init_cpu_notes() will return early with 0. + */ + kexec_init_cpu_notes(nr); - /* The Xen info note is included in CPU0''s range. */ - if ( nr == 0 ) - nr_bytes += sizeof_note("Xen", sizeof(crash_xen_info_t)); + /* In the case of still not having enough memory to allocate buffer room, + * returning a range of 0,0 is still valid. */ + if ( crash_notes[nr].start ) + { + range->start = __pa(crash_notes[nr].start); + range->size = crash_notes[nr].size; + } + else + range->start = range->size = 0; - if ( per_cpu(crash_notes, nr) == NULL ) - { - Elf_Note *note; - - note = per_cpu(crash_notes, nr) = xmalloc_bytes(nr_bytes); - - if ( note == NULL ) - return -ENOMEM; - - /* Setup CORE note. */ - setup_note(note, "CORE", NT_PRSTATUS, sizeof(ELF_Prstatus)); - - /* Setup Xen CORE note. */ - note = ELFNOTE_NEXT(note); - setup_note(note, "Xen", XEN_ELFNOTE_CRASH_REGS, sizeof(crash_xen_core_t)); - - if (nr == 0) - { - /* Setup system wide Xen info note. */ - xen_crash_note = note = ELFNOTE_NEXT(note); - setup_note(note, "Xen", XEN_ELFNOTE_CRASH_INFO, sizeof(crash_xen_info_t)); - } - } - - range->start = __pa((unsigned long)per_cpu(crash_notes, nr)); - range->size = nr_bytes; return 0; }
It is very useful for the crashdump kernel to have an idea of what Xen thought was the maximum memory address was, especially if the crashdump kernel is 32bit. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 977fa1c61b3e -r 1e79fb200722 xen/arch/x86/mm.c --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -149,6 +149,7 @@ struct domain *dom_xen, *dom_io, *dom_co /* Frame table size in pages. */ unsigned long max_page; unsigned long total_pages; +u64 top_of_mem; unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS( (FRAMETABLE_SIZE / sizeof(*frame_table) + PDX_GROUP_COUNT - 1) diff -r 977fa1c61b3e -r 1e79fb200722 xen/arch/x86/setup.c --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -1111,6 +1111,7 @@ void __init __start_xen(unsigned long mb nr_pages >> (20 - PAGE_SHIFT), nr_pages << (PAGE_SHIFT - 10)); total_pages = nr_pages; + top_of_mem = e820.map[e820.nr_map-1].addr + e820.map[e820.nr_map-1].size; /* Sanity check for unwanted bloat of certain hypercall structures. */ BUILD_BUG_ON(sizeof(((struct xen_platform_op *)0)->u) !diff -r 977fa1c61b3e -r 1e79fb200722 xen/include/asm-x86/mm.h --- a/xen/include/asm-x86/mm.h +++ b/xen/include/asm-x86/mm.h @@ -296,6 +296,7 @@ int get_superpage(unsigned long mfn, str #endif extern unsigned long max_page; extern unsigned long total_pages; +extern u64 top_of_mem; void init_frametable(void); #define PDX_GROUP_COUNT ((1 << L2_PAGETABLE_SHIFT) / \
Andrew Cooper
2012-Mar-09 14:42 UTC
[PATCH 3 of 4] KEXEC: Allocate crash structures in low memory
On 64bit Xen with 32bit dom0 and crashkernel, xmalloc''ing items such as the CPU crash notes will go into the xenheap, which tends to be in upper memory. This causes problems on machines with more than 64GB (or 4GB if no PAE support) of ram as the crashkernel physically cant access the crash notes. The solution is to force Xen to allocate certain structures in lower memory. This is achieved by introducing two new command line parameters; low_crashinfo and crashinfo_maxaddr. Because of the potential impact on 32bit PV guests, and that this problem does not exist for 64bit dom0 on 64bit Xen, this new functionality defaults to the codebase''s previous behavior, requiring the user to explicitly add extra command line parameters to change the behavior. This patch consists of 3 logically distinct but closely related changes. 1) Add the two new command line parameters. 2) Change crash note allocation to use lower memory when instructed. 3) Change the conring buffer to use lower memory when instructed. There result is that the crash notes and console ring will be placed in lower memory so useful information can be recovered in the case of a crash. Changes since v1: - Patch xen-command-line.markdown to document new options Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 1e79fb200722 -r 3669bd0ac6b9 docs/misc/xen-command-line.markdown --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -188,6 +188,13 @@ The optional trailing `x` indicates that ### cpuid\_mask\_xsave\_eax ### cpuidle ### cpuinfo +### crashinfo_maxaddr +> `= <size>` + +> Default: `4G` + +Specify the maximum address to allocate certain strucutres, if used in combination with the `low_crashinfo` command line option. + ### crashkernel ### credit2\_balance\_over ### credit2\_balance\_under @@ -273,6 +280,13 @@ Set the logging level for Xen. Any log The optional `<rate-limited level>` options instructs which severities should be rate limited. +### low\_crashinfo +> `= none | min | all` + +> Default: `none` if not specified at all, or to `min` if `low\_crashinfo` is present without qualification. + +This option is only useful for hosts with a 32bit dom0 kernel, wishing to use kexec functionality in the case of a crash. It represents which data structures should be deliberatly allocated in low memory, so the crash kernel may find find them. Should be used in combination with `crashinfo_maxaddr`. + ### max\_cstate ### max\_gsi\_irqs ### maxcpus diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/arch/x86/setup.c --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -586,6 +586,10 @@ void __init __start_xen(unsigned long mb } cmdline_parse(cmdline); + /* Must be after command line argument parsing and before + * allocing any xenheap structures wanted in lower memory. */ + kexec_early_calculations(); + parse_video_info(); set_current((struct vcpu *)0xfffff000); /* debug sanity */ diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/common/kexec.c --- a/xen/common/kexec.c +++ b/xen/common/kexec.c @@ -36,7 +36,9 @@ bool_t kexecing = FALSE; typedef struct { Elf_Note * start; size_t size; } crash_note_range_t; static crash_note_range_t * crash_notes; -/* Lock to prevent race conditions when allocating the crash note buffers. */ +/* Lock to prevent race conditions when allocating the crash note buffers. + * It also serves to protect calls to alloc_from_crash_heap when allocating + * crash note buffers in lower memory. */ static DEFINE_SPINLOCK(crash_notes_lock); static Elf_Note *xen_crash_note; @@ -62,6 +64,24 @@ static struct { unsigned long size; } ranges[16] __initdata; +/* Low crashinfo mode. Start as INVALID so serveral codepaths can set up + * defaults without needing to know the state of the others. */ +enum low_crashinfo low_crashinfo_mode = LOW_CRASHINFO_INVALID; + +/* This value is only considered if low_crash_mode is set to MIN or ALL, so + * setting a default here is safe. Default to 4GB. This is because the current + * KEXEC_CMD_get_range compat hypercall trucates 64bit pointers to 32 bits. The + * typical usecase for crashinfo_maxaddr will be for 64bit Xen with 32bit dom0 + * and 32bit crash kernel. */ +static paddr_t __initdata crashinfo_maxaddr = 4ULL << 30; + +/* = log base 2 of crashinfo_maxaddr after checking for sanity. Default to + * larger than the entire physical address space. */ +paddr_t crashinfo_maxaddr_bits = 64; + +/* Pointers to keep track of the crash heap region. */ +static void *crash_heap_current = NULL, *crash_heap_end = NULL; + /* * Parse command lines in the format * @@ -140,6 +160,58 @@ static void __init parse_crashkernel(con } custom_param("crashkernel", parse_crashkernel); +/* Parse command lines in the format: + * + * low_crashinfo=[none,min,all] + * + * - none disables the low allocation of crash info. + * - min will allocate enough low information for the crash kernel to be able + * to extract the hypervisor and dom0 message ring buffers. + * - all will allocate additional structures such as domain and vcpu structs + * low so the crash kernel can perform an extended analysis of state. + */ +static void __init parse_low_crashinfo(const char * str) +{ + + if ( !strlen(str) ) + /* default to min if user just specifies "low_crashinfo" */ + low_crashinfo_mode = LOW_CRASHINFO_MIN; + else if ( !strcmp(str, "none" ) ) + low_crashinfo_mode = LOW_CRASHINFO_NONE; + else if ( !strcmp(str, "min" ) ) + low_crashinfo_mode = LOW_CRASHINFO_MIN; + else if ( !strcmp(str, "all" ) ) + low_crashinfo_mode = LOW_CRASHINFO_ALL; + else + { + printk("Unknown low_crashinfo parameter ''%s''. Defaulting to min.\n", str); + low_crashinfo_mode = LOW_CRASHINFO_MIN; + } +} +custom_param("low_crashinfo", parse_low_crashinfo); + +/* Parse command lines in the format: + * + * crashinfo_maxaddr=<addr> + * + * <addr> will be rounded down to the nearest power of two. Defaults to 64G + */ +static void __init parse_crashinfo_maxaddr(const char * str) +{ + u64 addr; + + /* if low_crashinfo_mode is unset, default to min. */ + if ( low_crashinfo_mode == LOW_CRASHINFO_INVALID ) + low_crashinfo_mode = LOW_CRASHINFO_MIN; + + if ( (addr = parse_size_and_unit(str, NULL)) ) + crashinfo_maxaddr = addr; + else + printk("Unable to parse crashinfo_maxaddr. Defaulting to %p\n", + (void*)crashinfo_maxaddr); +} +custom_param("crashinfo_maxaddr", parse_crashinfo_maxaddr); + void __init set_kexec_crash_area_size(u64 system_ram) { unsigned int idx; @@ -298,6 +370,20 @@ static size_t sizeof_cpu_notes(const uns return bytes; } +/* Allocate size_t bytes of space from the previously allocated + * crash heap if the user has requested that crash notes be allocated + * in lower memory. There is currently no case where the crash notes + * should be free()''d. */ +static void * alloc_from_crash_heap(const size_t bytes) +{ + void * ret; + if ( crash_heap_current + bytes > crash_heap_end ) + return NULL; + ret = (void*)crash_heap_current; + crash_heap_current += bytes; + return ret; +} + /* Allocate a crash note buffer for a newly onlined cpu. */ static int kexec_init_cpu_notes(const unsigned long cpu) { @@ -312,7 +398,10 @@ static int kexec_init_cpu_notes(const un return ret; nr_bytes = sizeof_cpu_notes(cpu); - note = xmalloc_bytes(nr_bytes); + + /* If we dont care about the position of allocation, malloc. */ + if ( low_crashinfo_mode == LOW_CRASHINFO_NONE ) + note = xmalloc_bytes(nr_bytes); /* Protect the write into crash_notes[] with a spinlock, as this function * is on a hotplug path and a hypercall path. */ @@ -330,6 +419,11 @@ static int kexec_init_cpu_notes(const un } else { + /* If we care about memory possition, alloc from the crash heap, + * also protected by the crash_notes_lock. */ + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) + note = alloc_from_crash_heap(nr_bytes); + crash_notes[cpu].start = note; crash_notes[cpu].size = nr_bytes; spin_unlock(&crash_notes_lock); @@ -389,6 +483,18 @@ static struct notifier_block cpu_nfb = { .notifier_call = cpu_callback }; +void __init kexec_early_calculations(void) +{ + /* If low_crashinfo_mode is still INVALID, neither "low_crashinfo" nor + * "crashinfo_maxaddr" have been specified on the command line, so + * explicitly set to NONE. */ + if ( low_crashinfo_mode == LOW_CRASHINFO_INVALID ) + low_crashinfo_mode = LOW_CRASHINFO_NONE; + + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) + crashinfo_maxaddr_bits = fls(crashinfo_maxaddr) - 1; +} + static int __init kexec_init(void) { void *cpu = (void *)(unsigned long)smp_processor_id(); @@ -399,6 +505,29 @@ static int __init kexec_init(void) register_keyhandler(''C'', &crashdump_trigger_keyhandler); + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) + { + size_t crash_heap_size; + + /* This calculation is safe even if the machine is booted in + * uniprocessor mode. */ + crash_heap_size = sizeof_cpu_notes(0) + + sizeof_cpu_notes(1) * (nr_cpu_ids - 1); + crash_heap_size = PAGE_ALIGN(crash_heap_size); + + crash_heap_current = alloc_xenheap_pages( + get_order_from_bytes(crash_heap_size), + MEMF_bits(crashinfo_maxaddr_bits) ); + + if ( ! crash_heap_current ) + return -ENOMEM; + + crash_heap_end = crash_heap_current + crash_heap_size; + } + + /* crash_notes may be allocated anywhere Xen can reach in memory. + Only the individual CPU crash notes themselves must be allocated + in lower memory if requested. */ crash_notes = xmalloc_array(crash_note_range_t, nr_cpu_ids); if ( ! crash_notes ) return -ENOMEM; diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/drivers/char/console.c --- a/xen/drivers/char/console.c +++ b/xen/drivers/char/console.c @@ -596,7 +596,7 @@ void __init console_init_postirq(void) opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh); order = get_order_from_bytes(max(opt_conring_size, conring_size)); - while ( (ring = alloc_xenheap_pages(order, 0)) == NULL ) + while ( (ring = alloc_xenheap_pages(order, MEMF_bits(crashinfo_maxaddr_bits))) == NULL ) { BUG_ON(order == 0); order--; diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/include/xen/kexec.h --- a/xen/include/xen/kexec.h +++ b/xen/include/xen/kexec.h @@ -25,6 +25,19 @@ void set_kexec_crash_area_size(u64 syste #define KEXEC_IMAGE_CRASH_BASE 2 #define KEXEC_IMAGE_NR 4 +enum low_crashinfo { + LOW_CRASHINFO_INVALID = 0, + LOW_CRASHINFO_NONE = 1, + LOW_CRASHINFO_MIN = 2, + LOW_CRASHINFO_ALL = 3 +}; + +/* Low crashinfo mode. Start as INVALID so serveral codepaths can set up + * defaults without needing to know the state of the others. */ +extern enum low_crashinfo low_crashinfo_mode; +extern paddr_t crashinfo_maxaddr_bits; +void kexec_early_calculations(void); + int machine_kexec_load(int type, int slot, xen_kexec_image_t *image); void machine_kexec_unload(int type, int slot, xen_kexec_image_t *image); void machine_kexec_reserved(xen_kexec_reserve_t *reservation);
Andrew Cooper
2012-Mar-09 14:42 UTC
[PATCH 4 of 4] KEXEC: Introduce new crashtables interface
Introduce the ''crashtables'' interface as an extensible method of passing data to the crashdump kernel, providing a substantially more flexability than the current method XenServer uses of hacking around with the symbol file. Care is taken to suitably align values, and to expliticly disabiguate pointers from sizes by use of a separate table, despite the raw data being compatabile between the two. In addition, all the details for "low_crashinfo=min" are passed into their respective tables. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 3669bd0ac6b9 -r b31818ccf572 xen/common/kexec.c --- a/xen/common/kexec.c +++ b/xen/common/kexec.c @@ -16,6 +16,7 @@ #include <xen/types.h> #include <xen/kexec.h> #include <xen/keyhandler.h> +#include <xen/compile.h> #include <public/kexec.h> #include <xen/cpumask.h> #include <asm/atomic.h> @@ -23,7 +24,9 @@ #include <xen/version.h> #include <xen/console.h> #include <xen/kexec.h> +#include <public/version.h> #include <public/elfnote.h> +#include <public/crashtables.h> #include <xsm/xsm.h> #include <xen/cpu.h> #ifdef CONFIG_COMPAT @@ -41,7 +44,7 @@ static crash_note_range_t * crash_notes; * crash note buffers in lower memory. */ static DEFINE_SPINLOCK(crash_notes_lock); -static Elf_Note *xen_crash_note; +static Elf_Note *xen_crash_note, *xen_stringtab, *xen_valtab, *xen_symtab; static cpumask_t crash_saved_cpus; @@ -82,6 +85,21 @@ paddr_t crashinfo_maxaddr_bits = 64; /* Pointers to keep track of the crash heap region. */ static void *crash_heap_current = NULL, *crash_heap_end = NULL; +char * StringTab = NULL; +size_t StringTabSize = 0; + +static val64tab_t ValTab[] +{ + { XEN_VALTAB_TOP_OF_MEM, 0 }, + { XEN_VALTAB_CONRING_SIZE, 0 } +}; + +static sym64tab_t SymTab[] +{ + { XEN_SYMTAB_CONRING, 0 } +}; + + /* * Parse command lines in the format * @@ -262,12 +280,116 @@ void kexec_crash_save_cpu(void) elf_core_save_regs(&prstatus->pr_reg, xencore); } +/* Round val up to a machine word alignment */ +size_t round_up(size_t val) +{ + return ((val + (sizeof(unsigned long)-1)) & + ~((sizeof (unsigned long))-1)); +} + + +extern xen_commandline_t saved_cmdline; +static int setup_stringtable(void) +{ + char * ptr; + size_t sl = sizeof(unsigned long); + +#define STRING(x) #x +#define INT2STR(x) STRING(x) + +#define XEN_STR_VERSION INT2STR(XEN_VERSION) "." INT2STR(XEN_SUBVERSION) \ + XEN_EXTRAVERSION +#define XEN_STR_COMPILED_BY XEN_COMPILE_BY "@" XEN_COMPILE_HOST "." \ + XEN_COMPILE_DOMAIN + + size_t size + sl + round_up(sizeof(XEN_STR_VERSION)) + + sl + round_up(sizeof(XEN_CHANGESET)) + + sl + round_up(sizeof(XEN_COMPILE_DATE)) + + sl + round_up(sizeof(XEN_STR_COMPILED_BY)) + + sl + round_up(sizeof(XEN_COMPILER)) + + sl + round_up(strlen(saved_cmdline)+1); + + StringTab = xmalloc_bytes(size); + if ( ! StringTab ) + return -ENOMEM; + + StringTabSize = size; + memset(StringTab, 0, size); + + ptr = StringTab; + +#define WRITE_STRTAB_ENTRY_CONST(v, s) \ + *(unsigned long*)ptr = (v); ptr += sl; \ + memcpy(ptr, s, sizeof(s)); ptr += round_up(sizeof(s)) + +#define WRITE_STRTAB_ENTRY_VAR(v, s) \ + *(unsigned long*)ptr = (v); ptr += sl; \ + memcpy(ptr, s, strlen(s)); ptr += round_up(strlen(s)+1) + + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_VERSION, XEN_STR_VERSION); + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_CSET, XEN_CHANGESET); + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILE_DATE, XEN_COMPILE_DATE); + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILED_BY, XEN_STR_COMPILED_BY); + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILER, XEN_COMPILER); + WRITE_STRTAB_ENTRY_VAR(XEN_STRINGTAB_CMDLINE, saved_cmdline); + +#undef WRITE_STRTAB_ENTRY_VAR +#undef WRITE_STRTAB_ENTRY_CONST +#undef XEN_STR_COMPILED_BY +#undef XEN_STR_VERSION +#undef INT2STR +#undef STRING + + return 0; +} + +static void write_val64tab(void) +{ + int i; + for ( i=0; i<ARRAY_SIZE(ValTab); ++i ) + { + switch( ValTab[i].id ) + { + case XEN_VALTAB_TOP_OF_MEM: + ValTab[i].val = (uint64_t)top_of_mem; + break; + case XEN_VALTAB_CONRING_SIZE: + ValTab[i].val = (uint64_t)conring_size; + break; + default: + printk(XENLOG_WARNING "Unrecognised ValTab ID 0x%016"PRIx64"\n", + ValTab[i].id); + break; + } + } +} + +static void write_sym64tab(void) +{ + int i; + for ( i=0; i<ARRAY_SIZE(SymTab); ++i ) + { + switch( SymTab[i].id ) + { + case XEN_SYMTAB_CONRING: + SymTab[i].addr = (uint64_t)__pa(conring); + break; + default: + printk(XENLOG_WARNING "Unrecognised SymTab ID 0x%016"PRIx64"\n", + SymTab[i].id); + break; + } + } +} + /* Set up the single Xen-specific-info crash note. */ crash_xen_info_t *kexec_crash_save_info(void) { int cpu = smp_processor_id(); crash_xen_info_t info; crash_xen_info_t *out = (crash_xen_info_t *)ELFNOTE_DESC(xen_crash_note); + char * details; BUG_ON(!cpumask_test_and_set_cpu(cpu, &crash_saved_cpus)); @@ -284,6 +406,17 @@ crash_xen_info_t *kexec_crash_save_info( /* Copy from guaranteed-aligned local copy to possibly-unaligned dest. */ memcpy(out, &info, sizeof(info)); + details = ELFNOTE_DESC(xen_stringtab); + memcpy(details, StringTab, StringTabSize); + + details = ELFNOTE_DESC(xen_valtab); + write_val64tab(); + memcpy(details, ValTab, sizeof(ValTab)); + + details = ELFNOTE_DESC(xen_symtab); + write_sym64tab(); + memcpy(details, SymTab, sizeof(SymTab)); + return out; } @@ -365,7 +498,10 @@ static size_t sizeof_cpu_notes(const uns /* CPU0 also presents the crash_xen_info note. */ if ( ! cpu ) bytes = bytes + - sizeof_note("Xen", sizeof(crash_xen_info_t)); + sizeof_note("Xen", sizeof(crash_xen_info_t)) + + sizeof_note("Xen", StringTabSize) + + sizeof_note("Xen", sizeof(ValTab)) + + sizeof_note("Xen", sizeof(SymTab)); return bytes; } @@ -449,6 +585,21 @@ static int kexec_init_cpu_notes(const un xen_crash_note = note = ELFNOTE_NEXT(note); setup_note(note, "Xen", XEN_ELFNOTE_CRASH_INFO, sizeof(crash_xen_info_t)); + + /* Set up Xen StringTab note. */ + xen_stringtab = note = ELFNOTE_NEXT(note); + setup_note(note, "Xen", XEN_CRASHTABLE_STRINGTAB, + StringTabSize); + + /* Set up Xen ValTab note. */ + xen_valtab = note = ELFNOTE_NEXT(note); + setup_note(note, "Xen", XEN_CRASHTABLE_VAL64TAB, + sizeof(ValTab)); + + /* Set up Xen SymTab note. */ + xen_symtab = note = ELFNOTE_NEXT(note); + setup_note(note, "Xen", XEN_CRASHTABLE_SYM64TAB, + sizeof(SymTab)); } } } @@ -505,6 +656,9 @@ static int __init kexec_init(void) register_keyhandler(''C'', &crashdump_trigger_keyhandler); + if ( setup_stringtable() ) + return -ENOMEM; + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) { size_t crash_heap_size; diff -r 3669bd0ac6b9 -r b31818ccf572 xen/drivers/char/console.c --- a/xen/drivers/char/console.c +++ b/xen/drivers/char/console.c @@ -59,8 +59,8 @@ size_param("conring_size", opt_conring_s #define _CONRING_SIZE 16384 #define CONRING_IDX_MASK(i) ((i)&(conring_size-1)) static char __initdata _conring[_CONRING_SIZE]; -static char *__read_mostly conring = _conring; -static uint32_t __read_mostly conring_size = _CONRING_SIZE; +char *__read_mostly conring = _conring; +uint32_t __read_mostly conring_size = _CONRING_SIZE; static uint32_t conringc, conringp; static int __read_mostly sercon_handle = -1; diff -r 3669bd0ac6b9 -r b31818ccf572 xen/include/public/crashtables.h --- /dev/null +++ b/xen/include/public/crashtables.h @@ -0,0 +1,77 @@ +/****************************************************************************** + * crashtables.h + * + * Definitions used for the Xen ELF crash notes. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * Copyright (c) 2011,2012 Andrew Cooper, Citrix Ltd. + */ + +#ifndef __XEN_PUBLIC_CRASHTABLES_H__ +#define __XEN_PUBLIC_CRASHTABLES_H__ + +#include "xen.h" + +/* + * Xen string crash table. + * Note is a list in the form of: + * Machine word (32/64bit) String ID. + * Null terminating string, 0-extended to word alignment. + */ +#define XEN_CRASHTABLE_STRINGTAB 0x3000000 + +#define XEN_STRINGTAB_VERSION 0 +#define XEN_STRINGTAB_CSET 1 +#define XEN_STRINGTAB_COMPILE_DATE 2 +#define XEN_STRINGTAB_COMPILED_BY 3 +#define XEN_STRINGTAB_COMPILER 4 +#define XEN_STRINGTAB_CMDLINE 5 + +/* + * Xen value crash table. + * Note is a list in the form of: + * Machine word (32/64bit) Value ID. + * 64bit value. + */ +#define XEN_CRASHTABLE_VAL64TAB 0x3000001 + +typedef struct { unsigned long id; uint64_t val; } val64tab_t; + +#define XEN_VALTAB_TOP_OF_MEM 0 +#define XEN_VALTAB_CONRING_SIZE 1 + +/* + * Xen symbol crash table + * Note is a list in the form of: + * Machine word (32/64bit) Symbol ID. + * Symbol Address (64bit pointers). + */ +#define XEN_CRASHTABLE_SYM64TAB 0x3000002 + +typedef struct { unsigned long id; uint64_t addr; } sym64tab_t; + +#define XEN_SYMTAB_CONRING 0 + + +#endif /* __XEN_PUBLIC_CRASHTABLES_H__ */ + +/* +TODO: reintroduce local vars here. emacs doesn''t get on well with patch files claiming to be C files + */ diff -r 3669bd0ac6b9 -r b31818ccf572 xen/include/xen/console.h --- a/xen/include/xen/console.h +++ b/xen/include/xen/console.h @@ -10,6 +10,9 @@ #include <xen/inttypes.h> #include <public/xen.h> +extern char * conring; +extern uint32_t conring_size; + struct xen_sysctl_readconsole; long read_console_ring(struct xen_sysctl_readconsole *op);
>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > +static int __init kexec_init(void) > +{ > + void *cpu = (void *)(unsigned long)smp_processor_id(); > + > + /* If no crash area, no need to allocate space for notes. */ > + if ( !kexec_crash_area.size ) > + return 0; > + > + register_keyhandler(''C'', &crashdump_trigger_keyhandler);Wouldn''t this better be done only after successful crash_notes allocation below?> + > + crash_notes = xmalloc_array(crash_note_range_t, nr_cpu_ids); > + if ( ! crash_notes ) > + return -ENOMEM; > + > + memset(crash_notes, 0, sizeof(crash_note_range_t) * nr_cpu_ids);Using xzalloc_array() above would be preferred.> + > + cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu); > + register_cpu_notifier(&cpu_nfb); > + return 0; > +}Looks okay otherwise, but I''m still not fully convinced all this is really needed. Jan
>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > It is very useful for the crashdump kernel to have an idea of what Xen > thought was the maximum memory address was, especially if the > crashdump kernel is 32bit.But then you also need to update the value upon memory hotplug. (And I assume subsequent patches will actually make use of this.) Jan> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 977fa1c61b3e -r 1e79fb200722 xen/arch/x86/mm.c > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -149,6 +149,7 @@ struct domain *dom_xen, *dom_io, *dom_co > /* Frame table size in pages. */ > unsigned long max_page; > unsigned long total_pages; > +u64 top_of_mem; > > unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS( > (FRAMETABLE_SIZE / sizeof(*frame_table) + PDX_GROUP_COUNT - 1) > diff -r 977fa1c61b3e -r 1e79fb200722 xen/arch/x86/setup.c > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -1111,6 +1111,7 @@ void __init __start_xen(unsigned long mb > nr_pages >> (20 - PAGE_SHIFT), > nr_pages << (PAGE_SHIFT - 10)); > total_pages = nr_pages; > + top_of_mem = e820.map[e820.nr_map-1].addr + e820.map[e820.nr_map-1].size; > > /* Sanity check for unwanted bloat of certain hypercall structures. */ > BUILD_BUG_ON(sizeof(((struct xen_platform_op *)0)->u) !> diff -r 977fa1c61b3e -r 1e79fb200722 xen/include/asm-x86/mm.h > --- a/xen/include/asm-x86/mm.h > +++ b/xen/include/asm-x86/mm.h > @@ -296,6 +296,7 @@ int get_superpage(unsigned long mfn, str > #endif > extern unsigned long max_page; > extern unsigned long total_pages; > +extern u64 top_of_mem; > void init_frametable(void); > > #define PDX_GROUP_COUNT ((1 << L2_PAGETABLE_SHIFT) / \ > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Keir Fraser
2012-Mar-09 15:52 UTC
Re: [PATCH 3 of 4] KEXEC: Allocate crash structures in low memory
On 09/03/2012 14:42, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> On 64bit Xen with 32bit dom0 and crashkernel, xmalloc''ing items such > as the CPU crash notes will go into the xenheap, which tends to be in > upper memory. This causes problems on machines with more than 64GB > (or 4GB if no PAE support) of ram as the crashkernel physically cant > access the crash notes.Why wouldn''t you run a 64-bit crashkernel? -- Keir> The solution is to force Xen to allocate certain structures in lower > memory. This is achieved by introducing two new command line > parameters; low_crashinfo and crashinfo_maxaddr. Because of the > potential impact on 32bit PV guests, and that this problem does not > exist for 64bit dom0 on 64bit Xen, this new functionality defaults to > the codebase''s previous behavior, requiring the user to explicitly > add extra command line parameters to change the behavior. > > This patch consists of 3 logically distinct but closely related > changes. > 1) Add the two new command line parameters. > 2) Change crash note allocation to use lower memory when instructed. > 3) Change the conring buffer to use lower memory when instructed. > > There result is that the crash notes and console ring will be placed > in lower memory so useful information can be recovered in the case of > a crash. > > Changes since v1: > - Patch xen-command-line.markdown to document new options > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 1e79fb200722 -r 3669bd0ac6b9 docs/misc/xen-command-line.markdown > --- a/docs/misc/xen-command-line.markdown > +++ b/docs/misc/xen-command-line.markdown > @@ -188,6 +188,13 @@ The optional trailing `x` indicates that > ### cpuid\_mask\_xsave\_eax > ### cpuidle > ### cpuinfo > +### crashinfo_maxaddr > +> `= <size>` > + > +> Default: `4G` > + > +Specify the maximum address to allocate certain strucutres, if used in > combination with the `low_crashinfo` command line option. > + > ### crashkernel > ### credit2\_balance\_over > ### credit2\_balance\_under > @@ -273,6 +280,13 @@ Set the logging level for Xen. Any log > > The optional `<rate-limited level>` options instructs which severities should > be rate limited. > > +### low\_crashinfo > +> `= none | min | all` > + > +> Default: `none` if not specified at all, or to `min` if `low\_crashinfo` is > present without qualification. > + > +This option is only useful for hosts with a 32bit dom0 kernel, wishing to use > kexec functionality in the case of a crash. It represents which data > structures should be deliberatly allocated in low memory, so the crash kernel > may find find them. Should be used in combination with `crashinfo_maxaddr`. > + > ### max\_cstate > ### max\_gsi\_irqs > ### maxcpus > diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/arch/x86/setup.c > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -586,6 +586,10 @@ void __init __start_xen(unsigned long mb > } > cmdline_parse(cmdline); > > + /* Must be after command line argument parsing and before > + * allocing any xenheap structures wanted in lower memory. */ > + kexec_early_calculations(); > + > parse_video_info(); > > set_current((struct vcpu *)0xfffff000); /* debug sanity */ > diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/common/kexec.c > --- a/xen/common/kexec.c > +++ b/xen/common/kexec.c > @@ -36,7 +36,9 @@ bool_t kexecing = FALSE; > typedef struct { Elf_Note * start; size_t size; } crash_note_range_t; > static crash_note_range_t * crash_notes; > > -/* Lock to prevent race conditions when allocating the crash note buffers. */ > +/* Lock to prevent race conditions when allocating the crash note buffers. > + * It also serves to protect calls to alloc_from_crash_heap when allocating > + * crash note buffers in lower memory. */ > static DEFINE_SPINLOCK(crash_notes_lock); > > static Elf_Note *xen_crash_note; > @@ -62,6 +64,24 @@ static struct { > unsigned long size; > } ranges[16] __initdata; > > +/* Low crashinfo mode. Start as INVALID so serveral codepaths can set up > + * defaults without needing to know the state of the others. */ > +enum low_crashinfo low_crashinfo_mode = LOW_CRASHINFO_INVALID; > + > +/* This value is only considered if low_crash_mode is set to MIN or ALL, so > + * setting a default here is safe. Default to 4GB. This is because the > current > + * KEXEC_CMD_get_range compat hypercall trucates 64bit pointers to 32 bits. > The > + * typical usecase for crashinfo_maxaddr will be for 64bit Xen with 32bit > dom0 > + * and 32bit crash kernel. */ > +static paddr_t __initdata crashinfo_maxaddr = 4ULL << 30; > + > +/* = log base 2 of crashinfo_maxaddr after checking for sanity. Default to > + * larger than the entire physical address space. */ > +paddr_t crashinfo_maxaddr_bits = 64; > + > +/* Pointers to keep track of the crash heap region. */ > +static void *crash_heap_current = NULL, *crash_heap_end = NULL; > + > /* > * Parse command lines in the format > * > @@ -140,6 +160,58 @@ static void __init parse_crashkernel(con > } > custom_param("crashkernel", parse_crashkernel); > > +/* Parse command lines in the format: > + * > + * low_crashinfo=[none,min,all] > + * > + * - none disables the low allocation of crash info. > + * - min will allocate enough low information for the crash kernel to be able > + * to extract the hypervisor and dom0 message ring buffers. > + * - all will allocate additional structures such as domain and vcpu structs > + * low so the crash kernel can perform an extended analysis of state. > + */ > +static void __init parse_low_crashinfo(const char * str) > +{ > + > + if ( !strlen(str) ) > + /* default to min if user just specifies "low_crashinfo" */ > + low_crashinfo_mode = LOW_CRASHINFO_MIN; > + else if ( !strcmp(str, "none" ) ) > + low_crashinfo_mode = LOW_CRASHINFO_NONE; > + else if ( !strcmp(str, "min" ) ) > + low_crashinfo_mode = LOW_CRASHINFO_MIN; > + else if ( !strcmp(str, "all" ) ) > + low_crashinfo_mode = LOW_CRASHINFO_ALL; > + else > + { > + printk("Unknown low_crashinfo parameter ''%s''. Defaulting to min.\n", > str); > + low_crashinfo_mode = LOW_CRASHINFO_MIN; > + } > +} > +custom_param("low_crashinfo", parse_low_crashinfo); > + > +/* Parse command lines in the format: > + * > + * crashinfo_maxaddr=<addr> > + * > + * <addr> will be rounded down to the nearest power of two. Defaults to 64G > + */ > +static void __init parse_crashinfo_maxaddr(const char * str) > +{ > + u64 addr; > + > + /* if low_crashinfo_mode is unset, default to min. */ > + if ( low_crashinfo_mode == LOW_CRASHINFO_INVALID ) > + low_crashinfo_mode = LOW_CRASHINFO_MIN; > + > + if ( (addr = parse_size_and_unit(str, NULL)) ) > + crashinfo_maxaddr = addr; > + else > + printk("Unable to parse crashinfo_maxaddr. Defaulting to %p\n", > + (void*)crashinfo_maxaddr); > +} > +custom_param("crashinfo_maxaddr", parse_crashinfo_maxaddr); > + > void __init set_kexec_crash_area_size(u64 system_ram) > { > unsigned int idx; > @@ -298,6 +370,20 @@ static size_t sizeof_cpu_notes(const uns > return bytes; > } > > +/* Allocate size_t bytes of space from the previously allocated > + * crash heap if the user has requested that crash notes be allocated > + * in lower memory. There is currently no case where the crash notes > + * should be free()''d. */ > +static void * alloc_from_crash_heap(const size_t bytes) > +{ > + void * ret; > + if ( crash_heap_current + bytes > crash_heap_end ) > + return NULL; > + ret = (void*)crash_heap_current; > + crash_heap_current += bytes; > + return ret; > +} > + > /* Allocate a crash note buffer for a newly onlined cpu. */ > static int kexec_init_cpu_notes(const unsigned long cpu) > { > @@ -312,7 +398,10 @@ static int kexec_init_cpu_notes(const un > return ret; > > nr_bytes = sizeof_cpu_notes(cpu); > - note = xmalloc_bytes(nr_bytes); > + > + /* If we dont care about the position of allocation, malloc. */ > + if ( low_crashinfo_mode == LOW_CRASHINFO_NONE ) > + note = xmalloc_bytes(nr_bytes); > > /* Protect the write into crash_notes[] with a spinlock, as this function > * is on a hotplug path and a hypercall path. */ > @@ -330,6 +419,11 @@ static int kexec_init_cpu_notes(const un > } > else > { > + /* If we care about memory possition, alloc from the crash heap, > + * also protected by the crash_notes_lock. */ > + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) > + note = alloc_from_crash_heap(nr_bytes); > + > crash_notes[cpu].start = note; > crash_notes[cpu].size = nr_bytes; > spin_unlock(&crash_notes_lock); > @@ -389,6 +483,18 @@ static struct notifier_block cpu_nfb = { > .notifier_call = cpu_callback > }; > > +void __init kexec_early_calculations(void) > +{ > + /* If low_crashinfo_mode is still INVALID, neither "low_crashinfo" nor > + * "crashinfo_maxaddr" have been specified on the command line, so > + * explicitly set to NONE. */ > + if ( low_crashinfo_mode == LOW_CRASHINFO_INVALID ) > + low_crashinfo_mode = LOW_CRASHINFO_NONE; > + > + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) > + crashinfo_maxaddr_bits = fls(crashinfo_maxaddr) - 1; > +} > + > static int __init kexec_init(void) > { > void *cpu = (void *)(unsigned long)smp_processor_id(); > @@ -399,6 +505,29 @@ static int __init kexec_init(void) > > register_keyhandler(''C'', &crashdump_trigger_keyhandler); > > + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) > + { > + size_t crash_heap_size; > + > + /* This calculation is safe even if the machine is booted in > + * uniprocessor mode. */ > + crash_heap_size = sizeof_cpu_notes(0) + > + sizeof_cpu_notes(1) * (nr_cpu_ids - 1); > + crash_heap_size = PAGE_ALIGN(crash_heap_size); > + > + crash_heap_current = alloc_xenheap_pages( > + get_order_from_bytes(crash_heap_size), > + MEMF_bits(crashinfo_maxaddr_bits) ); > + > + if ( ! crash_heap_current ) > + return -ENOMEM; > + > + crash_heap_end = crash_heap_current + crash_heap_size; > + } > + > + /* crash_notes may be allocated anywhere Xen can reach in memory. > + Only the individual CPU crash notes themselves must be allocated > + in lower memory if requested. */ > crash_notes = xmalloc_array(crash_note_range_t, nr_cpu_ids); > if ( ! crash_notes ) > return -ENOMEM; > diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/drivers/char/console.c > --- a/xen/drivers/char/console.c > +++ b/xen/drivers/char/console.c > @@ -596,7 +596,7 @@ void __init console_init_postirq(void) > opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh); > > order = get_order_from_bytes(max(opt_conring_size, conring_size)); > - while ( (ring = alloc_xenheap_pages(order, 0)) == NULL ) > + while ( (ring = alloc_xenheap_pages(order, > MEMF_bits(crashinfo_maxaddr_bits))) == NULL ) > { > BUG_ON(order == 0); > order--; > diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/include/xen/kexec.h > --- a/xen/include/xen/kexec.h > +++ b/xen/include/xen/kexec.h > @@ -25,6 +25,19 @@ void set_kexec_crash_area_size(u64 syste > #define KEXEC_IMAGE_CRASH_BASE 2 > #define KEXEC_IMAGE_NR 4 > > +enum low_crashinfo { > + LOW_CRASHINFO_INVALID = 0, > + LOW_CRASHINFO_NONE = 1, > + LOW_CRASHINFO_MIN = 2, > + LOW_CRASHINFO_ALL = 3 > +}; > + > +/* Low crashinfo mode. Start as INVALID so serveral codepaths can set up > + * defaults without needing to know the state of the others. */ > +extern enum low_crashinfo low_crashinfo_mode; > +extern paddr_t crashinfo_maxaddr_bits; > +void kexec_early_calculations(void); > + > int machine_kexec_load(int type, int slot, xen_kexec_image_t *image); > void machine_kexec_unload(int type, int slot, xen_kexec_image_t *image); > void machine_kexec_reserved(xen_kexec_reserve_t *reservation); > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Jan Beulich
2012-Mar-09 15:55 UTC
Re: [PATCH 3 of 4] KEXEC: Allocate crash structures in low memory
>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > --- a/xen/drivers/char/console.c > +++ b/xen/drivers/char/console.c > @@ -596,7 +596,7 @@ void __init console_init_postirq(void) > opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh); > > order = get_order_from_bytes(max(opt_conring_size, conring_size)); > - while ( (ring = alloc_xenheap_pages(order, 0)) == NULL ) > + while ( (ring = alloc_xenheap_pages(order, MEMF_bits(crashinfo_maxaddr_bits))) == NULL ) > { > BUG_ON(order == 0); > order--;Did you note the loop here? Rather than shrinking the size, I think you ought to ignore the address limit as the first fallback option, in particular when the size was specified on the command line. Jan
Andrew Cooper
2012-Mar-09 16:08 UTC
Re: [PATCH 1 of 4] KEXEC: Allocate crash notes on boot
On 09/03/12 15:41, Jan Beulich wrote:>>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> +static int __init kexec_init(void) >> +{ >> + void *cpu = (void *)(unsigned long)smp_processor_id(); >> + >> + /* If no crash area, no need to allocate space for notes. */ >> + if ( !kexec_crash_area.size ) >> + return 0; >> + >> + register_keyhandler(''C'', &crashdump_trigger_keyhandler); > Wouldn''t this better be done only after successful crash_notes > allocation below?Yes>> + >> + crash_notes = xmalloc_array(crash_note_range_t, nr_cpu_ids); >> + if ( ! crash_notes ) >> + return -ENOMEM; >> + >> + memset(crash_notes, 0, sizeof(crash_note_range_t) * nr_cpu_ids); > Using xzalloc_array() above would be preferred. >Doh! I started this patch against unstable, with xzalloc, then ported to 4.1 without xzalloc, then ported back to unstable and forgot to change. I will respin it>> + >> + cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu); >> + register_cpu_notifier(&cpu_nfb); >> + return 0; >> +} > Looks okay otherwise, but I''m still not fully convinced all this is really > needed. > > Jan >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Mar-09 16:11 UTC
Re: [PATCH 4 of 4] KEXEC: Introduce new crashtables interface
>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > --- a/xen/common/kexec.c > +++ b/xen/common/kexec.c > @@ -82,6 +85,21 @@ paddr_t crashinfo_maxaddr_bits = 64; > /* Pointers to keep track of the crash heap region. */ > static void *crash_heap_current = NULL, *crash_heap_end = NULL; > > +char * StringTab = NULL; > +size_t StringTabSize = 0;static? Naming convention?> + > +static val64tab_t ValTab[]const? Naming convention?> +{ > + { XEN_VALTAB_TOP_OF_MEM, 0 }, > + { XEN_VALTAB_CONRING_SIZE, 0 } > +}; > + > +static sym64tab_t SymTab[]const? Naming convention?> +{ > + { XEN_SYMTAB_CONRING, 0 } > +}; > + > + > /* > * Parse command lines in the format > * > @@ -262,12 +280,116 @@ void kexec_crash_save_cpu(void) > elf_core_save_regs(&prstatus->pr_reg, xencore); > } > > +/* Round val up to a machine word alignment */ > +size_t round_up(size_t val) > +{ > + return ((val + (sizeof(unsigned long)-1)) & > + ~((sizeof (unsigned long))-1)); > +} > + > + > +extern xen_commandline_t saved_cmdline; > +static int setup_stringtable(void) > +{ > + char * ptr; > + size_t sl = sizeof(unsigned long); > + > +#define STRING(x) #x > +#define INT2STR(x) STRING(x) > + > +#define XEN_STR_VERSION INT2STR(XEN_VERSION) "." INT2STR(XEN_SUBVERSION) \ > + XEN_EXTRAVERSION > +#define XEN_STR_COMPILED_BY XEN_COMPILE_BY "@" XEN_COMPILE_HOST "." \ > + XEN_COMPILE_DOMAIN > + > + size_t size > + sl + round_up(sizeof(XEN_STR_VERSION)) + > + sl + round_up(sizeof(XEN_CHANGESET)) + > + sl + round_up(sizeof(XEN_COMPILE_DATE)) + > + sl + round_up(sizeof(XEN_STR_COMPILED_BY)) + > + sl + round_up(sizeof(XEN_COMPILER)) + > + sl + round_up(strlen(saved_cmdline)+1); > + > + StringTab = xmalloc_bytes(size); > + if ( ! StringTab ) > + return -ENOMEM; > + > + StringTabSize = size; > + memset(StringTab, 0, size); > + > + ptr = StringTab; > + > +#define WRITE_STRTAB_ENTRY_CONST(v, s) \ > + *(unsigned long*)ptr = (v); ptr += sl; \ > + memcpy(ptr, s, sizeof(s)); ptr += round_up(sizeof(s)) > + > +#define WRITE_STRTAB_ENTRY_VAR(v, s) \ > + *(unsigned long*)ptr = (v); ptr += sl; \ > + memcpy(ptr, s, strlen(s)); ptr += round_up(strlen(s)+1) > + > + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_VERSION, XEN_STR_VERSION); > + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_CSET, XEN_CHANGESET); > + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILE_DATE, XEN_COMPILE_DATE); > + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILED_BY, XEN_STR_COMPILED_BY); > + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILER, XEN_COMPILER); > + WRITE_STRTAB_ENTRY_VAR(XEN_STRINGTAB_CMDLINE, saved_cmdline); > + > +#undef WRITE_STRTAB_ENTRY_VAR > +#undef WRITE_STRTAB_ENTRY_CONST > +#undef XEN_STR_COMPILED_BY > +#undef XEN_STR_VERSION > +#undef INT2STR > +#undef STRING > + > + return 0; > +} > + > +static void write_val64tab(void) > +{ > + int i; > + for ( i=0; i<ARRAY_SIZE(ValTab); ++i ) > + { > + switch( ValTab[i].id ) > + { > + case XEN_VALTAB_TOP_OF_MEM: > + ValTab[i].val = (uint64_t)top_of_mem; > + break; > + case XEN_VALTAB_CONRING_SIZE: > + ValTab[i].val = (uint64_t)conring_size; > + break; > + default: > + printk(XENLOG_WARNING "Unrecognised ValTab ID 0x%016"PRIx64"\n", > + ValTab[i].id);Sorry, but this is a BUG() or an ASSERT(), but not a warning.> + break; > + } > + } > +} > + > +static void write_sym64tab(void) > +{ > + int i; > + for ( i=0; i<ARRAY_SIZE(SymTab); ++i ) > + { > + switch( SymTab[i].id ) > + { > + case XEN_SYMTAB_CONRING: > + SymTab[i].addr = (uint64_t)__pa(conring); > + break; > + default: > + printk(XENLOG_WARNING "Unrecognised SymTab ID 0x%016"PRIx64"\n", > + SymTab[i].id);Same here.> + break; > + } > + } > +} > + > /* Set up the single Xen-specific-info crash note. */ > crash_xen_info_t *kexec_crash_save_info(void) > { > int cpu = smp_processor_id(); > crash_xen_info_t info; > crash_xen_info_t *out = (crash_xen_info_t *)ELFNOTE_DESC(xen_crash_note); > + char * details; > > BUG_ON(!cpumask_test_and_set_cpu(cpu, &crash_saved_cpus)); > > @@ -284,6 +406,17 @@ crash_xen_info_t *kexec_crash_save_info( > /* Copy from guaranteed-aligned local copy to possibly-unaligned dest. */ > memcpy(out, &info, sizeof(info)); > > + details = ELFNOTE_DESC(xen_stringtab); > + memcpy(details, StringTab, StringTabSize); > + > + details = ELFNOTE_DESC(xen_valtab); > + write_val64tab(); > + memcpy(details, ValTab, sizeof(ValTab)); > + > + details = ELFNOTE_DESC(xen_symtab); > + write_sym64tab(); > + memcpy(details, SymTab, sizeof(SymTab)); > + > return out; > } > > @@ -365,7 +498,10 @@ static size_t sizeof_cpu_notes(const uns > /* CPU0 also presents the crash_xen_info note. */ > if ( ! cpu ) > bytes = bytes + > - sizeof_note("Xen", sizeof(crash_xen_info_t)); > + sizeof_note("Xen", sizeof(crash_xen_info_t)) + > + sizeof_note("Xen", StringTabSize) + > + sizeof_note("Xen", sizeof(ValTab)) + > + sizeof_note("Xen", sizeof(SymTab)); > > return bytes; > } > @@ -449,6 +585,21 @@ static int kexec_init_cpu_notes(const un > xen_crash_note = note = ELFNOTE_NEXT(note); > setup_note(note, "Xen", XEN_ELFNOTE_CRASH_INFO, > sizeof(crash_xen_info_t)); > + > + /* Set up Xen StringTab note. */ > + xen_stringtab = note = ELFNOTE_NEXT(note); > + setup_note(note, "Xen", XEN_CRASHTABLE_STRINGTAB, > + StringTabSize); > + > + /* Set up Xen ValTab note. */ > + xen_valtab = note = ELFNOTE_NEXT(note); > + setup_note(note, "Xen", XEN_CRASHTABLE_VAL64TAB, > + sizeof(ValTab)); > + > + /* Set up Xen SymTab note. */ > + xen_symtab = note = ELFNOTE_NEXT(note); > + setup_note(note, "Xen", XEN_CRASHTABLE_SYM64TAB, > + sizeof(SymTab)); > } > } > } > @@ -505,6 +656,9 @@ static int __init kexec_init(void) > > register_keyhandler(''C'', &crashdump_trigger_keyhandler); > > + if ( setup_stringtable() ) > + return -ENOMEM;Given that this is an extension, I don''t think failing to set this up should be fatal in any way. Just don''t write those new notes when you can''t set them up.> + > if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) > { > size_t crash_heap_size; > --- a/xen/drivers/char/console.c > +++ b/xen/drivers/char/console.c > @@ -59,8 +59,8 @@ size_param("conring_size", opt_conring_s > #define _CONRING_SIZE 16384 > #define CONRING_IDX_MASK(i) ((i)&(conring_size-1)) > static char __initdata _conring[_CONRING_SIZE]; > -static char *__read_mostly conring = _conring; > -static uint32_t __read_mostly conring_size = _CONRING_SIZE; > +char *__read_mostly conring = _conring; > +uint32_t __read_mostly conring_size = _CONRING_SIZE;Please don''t. If anything, add a call into this file to have the note fields filled.> static uint32_t conringc, conringp; > > static int __read_mostly sercon_handle = -1; > --- /dev/null > +++ b/xen/include/public/crashtables.h > @@ -0,0 +1,77 @@ > +/****************************************************************************** > + * crashtables.h > + * > + * Definitions used for the Xen ELF crash notes. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a copy > + * of this software and associated documentation files (the "Software"), to > + * deal in the Software without restriction, including without limitation the > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or > + * sell copies of the Software, and to permit persons to whom the Software is > + * furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > + * DEALINGS IN THE SOFTWARE. > + * > + * Copyright (c) 2011,2012 Andrew Cooper, Citrix Ltd. > + */ > + > +#ifndef __XEN_PUBLIC_CRASHTABLES_H__ > +#define __XEN_PUBLIC_CRASHTABLES_H__ > + > +#include "xen.h" > + > +/* > + * Xen string crash table. > + * Note is a list in the form of: > + * Machine word (32/64bit) String ID. > + * Null terminating string, 0-extended to word alignment. > + */ > +#define XEN_CRASHTABLE_STRINGTAB 0x3000000The number looks as if it wasn''t picked arbitrarily, so a comment should say what it was derived from. Or if it is truly (and compatibly) arbitrary, imo this should be stated here too.> + > +#define XEN_STRINGTAB_VERSION 0 > +#define XEN_STRINGTAB_CSET 1 > +#define XEN_STRINGTAB_COMPILE_DATE 2 > +#define XEN_STRINGTAB_COMPILED_BY 3 > +#define XEN_STRINGTAB_COMPILER 4 > +#define XEN_STRINGTAB_CMDLINE 5 > + > +/* > + * Xen value crash table. > + * Note is a list in the form of: > + * Machine word (32/64bit) Value ID. > + * 64bit value. > + */ > +#define XEN_CRASHTABLE_VAL64TAB 0x3000001 > + > +typedef struct { unsigned long id; uint64_t val; } val64tab_t;You''re saying you''re doing this to improve 32-/64-bit interaction. How is using ''unsigned long'' here (and below) fitting into this picture?> + > +#define XEN_VALTAB_TOP_OF_MEM 0 > +#define XEN_VALTAB_CONRING_SIZE 1 > + > +/* > + * Xen symbol crash table > + * Note is a list in the form of: > + * Machine word (32/64bit) Symbol ID. > + * Symbol Address (64bit pointers). > + */ > +#define XEN_CRASHTABLE_SYM64TAB 0x3000002 > + > +typedef struct { unsigned long id; uint64_t addr; } sym64tab_t; > + > +#define XEN_SYMTAB_CONRING 0 > + > + > +#endif /* __XEN_PUBLIC_CRASHTABLES_H__ */ > + > +/* > +TODO: reintroduce local vars here. emacs doesn''t get on well with patch files claiming to be C files > + */??? Jan
Andrew Cooper
2012-Mar-09 16:24 UTC
Re: [PATCH 3 of 4] KEXEC: Allocate crash structures in low memory
On 09/03/12 15:52, Keir Fraser wrote:> On 09/03/2012 14:42, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: > >> On 64bit Xen with 32bit dom0 and crashkernel, xmalloc''ing items such >> as the CPU crash notes will go into the xenheap, which tends to be in >> upper memory. This causes problems on machines with more than 64GB >> (or 4GB if no PAE support) of ram as the crashkernel physically cant >> access the crash notes. > Why wouldn''t you run a 64-bit crashkernel? > > -- KeirThere is certainly an argument to be made along those lines, and I personally lean to that side of the argument. However, as our dom0 kernel is 32bits, there is no substantial testing of the drivers built for 64bit. Given the number of driver errors we have to fix in 32bit mode, and the number of errors which only show in the kdump environment, the predominant view is that using a 64bit kdump kernel with 32bit dom0 is just asking for extra trouble in an area which is not easy to test in the first place. As a result, we will not be updating to a 64bit kdump kernel until we update to a 64bit dom0 kernel. Along with that comes many performance considerations, meaning that it is not going to happen soon. ~Andrew>> The solution is to force Xen to allocate certain structures in lower >> memory. This is achieved by introducing two new command line >> parameters; low_crashinfo and crashinfo_maxaddr. Because of the >> potential impact on 32bit PV guests, and that this problem does not >> exist for 64bit dom0 on 64bit Xen, this new functionality defaults to >> the codebase''s previous behavior, requiring the user to explicitly >> add extra command line parameters to change the behavior. >> >> This patch consists of 3 logically distinct but closely related >> changes. >> 1) Add the two new command line parameters. >> 2) Change crash note allocation to use lower memory when instructed. >> 3) Change the conring buffer to use lower memory when instructed. >> >> There result is that the crash notes and console ring will be placed >> in lower memory so useful information can be recovered in the case of >> a crash. >> >> Changes since v1: >> - Patch xen-command-line.markdown to document new options >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 1e79fb200722 -r 3669bd0ac6b9 docs/misc/xen-command-line.markdown >> --- a/docs/misc/xen-command-line.markdown >> +++ b/docs/misc/xen-command-line.markdown >> @@ -188,6 +188,13 @@ The optional trailing `x` indicates that >> ### cpuid\_mask\_xsave\_eax >> ### cpuidle >> ### cpuinfo >> +### crashinfo_maxaddr >> +> `= <size>` >> + >> +> Default: `4G` >> + >> +Specify the maximum address to allocate certain strucutres, if used in >> combination with the `low_crashinfo` command line option. >> + >> ### crashkernel >> ### credit2\_balance\_over >> ### credit2\_balance\_under >> @@ -273,6 +280,13 @@ Set the logging level for Xen. Any log >> >> The optional `<rate-limited level>` options instructs which severities should >> be rate limited. >> >> +### low\_crashinfo >> +> `= none | min | all` >> + >> +> Default: `none` if not specified at all, or to `min` if `low\_crashinfo` is >> present without qualification. >> + >> +This option is only useful for hosts with a 32bit dom0 kernel, wishing to use >> kexec functionality in the case of a crash. It represents which data >> structures should be deliberatly allocated in low memory, so the crash kernel >> may find find them. Should be used in combination with `crashinfo_maxaddr`. >> + >> ### max\_cstate >> ### max\_gsi\_irqs >> ### maxcpus >> diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/arch/x86/setup.c >> --- a/xen/arch/x86/setup.c >> +++ b/xen/arch/x86/setup.c >> @@ -586,6 +586,10 @@ void __init __start_xen(unsigned long mb >> } >> cmdline_parse(cmdline); >> >> + /* Must be after command line argument parsing and before >> + * allocing any xenheap structures wanted in lower memory. */ >> + kexec_early_calculations(); >> + >> parse_video_info(); >> >> set_current((struct vcpu *)0xfffff000); /* debug sanity */ >> diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/common/kexec.c >> --- a/xen/common/kexec.c >> +++ b/xen/common/kexec.c >> @@ -36,7 +36,9 @@ bool_t kexecing = FALSE; >> typedef struct { Elf_Note * start; size_t size; } crash_note_range_t; >> static crash_note_range_t * crash_notes; >> >> -/* Lock to prevent race conditions when allocating the crash note buffers. */ >> +/* Lock to prevent race conditions when allocating the crash note buffers. >> + * It also serves to protect calls to alloc_from_crash_heap when allocating >> + * crash note buffers in lower memory. */ >> static DEFINE_SPINLOCK(crash_notes_lock); >> >> static Elf_Note *xen_crash_note; >> @@ -62,6 +64,24 @@ static struct { >> unsigned long size; >> } ranges[16] __initdata; >> >> +/* Low crashinfo mode. Start as INVALID so serveral codepaths can set up >> + * defaults without needing to know the state of the others. */ >> +enum low_crashinfo low_crashinfo_mode = LOW_CRASHINFO_INVALID; >> + >> +/* This value is only considered if low_crash_mode is set to MIN or ALL, so >> + * setting a default here is safe. Default to 4GB. This is because the >> current >> + * KEXEC_CMD_get_range compat hypercall trucates 64bit pointers to 32 bits. >> The >> + * typical usecase for crashinfo_maxaddr will be for 64bit Xen with 32bit >> dom0 >> + * and 32bit crash kernel. */ >> +static paddr_t __initdata crashinfo_maxaddr = 4ULL << 30; >> + >> +/* = log base 2 of crashinfo_maxaddr after checking for sanity. Default to >> + * larger than the entire physical address space. */ >> +paddr_t crashinfo_maxaddr_bits = 64; >> + >> +/* Pointers to keep track of the crash heap region. */ >> +static void *crash_heap_current = NULL, *crash_heap_end = NULL; >> + >> /* >> * Parse command lines in the format >> * >> @@ -140,6 +160,58 @@ static void __init parse_crashkernel(con >> } >> custom_param("crashkernel", parse_crashkernel); >> >> +/* Parse command lines in the format: >> + * >> + * low_crashinfo=[none,min,all] >> + * >> + * - none disables the low allocation of crash info. >> + * - min will allocate enough low information for the crash kernel to be able >> + * to extract the hypervisor and dom0 message ring buffers. >> + * - all will allocate additional structures such as domain and vcpu structs >> + * low so the crash kernel can perform an extended analysis of state. >> + */ >> +static void __init parse_low_crashinfo(const char * str) >> +{ >> + >> + if ( !strlen(str) ) >> + /* default to min if user just specifies "low_crashinfo" */ >> + low_crashinfo_mode = LOW_CRASHINFO_MIN; >> + else if ( !strcmp(str, "none" ) ) >> + low_crashinfo_mode = LOW_CRASHINFO_NONE; >> + else if ( !strcmp(str, "min" ) ) >> + low_crashinfo_mode = LOW_CRASHINFO_MIN; >> + else if ( !strcmp(str, "all" ) ) >> + low_crashinfo_mode = LOW_CRASHINFO_ALL; >> + else >> + { >> + printk("Unknown low_crashinfo parameter ''%s''. Defaulting to min.\n", >> str); >> + low_crashinfo_mode = LOW_CRASHINFO_MIN; >> + } >> +} >> +custom_param("low_crashinfo", parse_low_crashinfo); >> + >> +/* Parse command lines in the format: >> + * >> + * crashinfo_maxaddr=<addr> >> + * >> + * <addr> will be rounded down to the nearest power of two. Defaults to 64G >> + */ >> +static void __init parse_crashinfo_maxaddr(const char * str) >> +{ >> + u64 addr; >> + >> + /* if low_crashinfo_mode is unset, default to min. */ >> + if ( low_crashinfo_mode == LOW_CRASHINFO_INVALID ) >> + low_crashinfo_mode = LOW_CRASHINFO_MIN; >> + >> + if ( (addr = parse_size_and_unit(str, NULL)) ) >> + crashinfo_maxaddr = addr; >> + else >> + printk("Unable to parse crashinfo_maxaddr. Defaulting to %p\n", >> + (void*)crashinfo_maxaddr); >> +} >> +custom_param("crashinfo_maxaddr", parse_crashinfo_maxaddr); >> + >> void __init set_kexec_crash_area_size(u64 system_ram) >> { >> unsigned int idx; >> @@ -298,6 +370,20 @@ static size_t sizeof_cpu_notes(const uns >> return bytes; >> } >> >> +/* Allocate size_t bytes of space from the previously allocated >> + * crash heap if the user has requested that crash notes be allocated >> + * in lower memory. There is currently no case where the crash notes >> + * should be free()''d. */ >> +static void * alloc_from_crash_heap(const size_t bytes) >> +{ >> + void * ret; >> + if ( crash_heap_current + bytes > crash_heap_end ) >> + return NULL; >> + ret = (void*)crash_heap_current; >> + crash_heap_current += bytes; >> + return ret; >> +} >> + >> /* Allocate a crash note buffer for a newly onlined cpu. */ >> static int kexec_init_cpu_notes(const unsigned long cpu) >> { >> @@ -312,7 +398,10 @@ static int kexec_init_cpu_notes(const un >> return ret; >> >> nr_bytes = sizeof_cpu_notes(cpu); >> - note = xmalloc_bytes(nr_bytes); >> + >> + /* If we dont care about the position of allocation, malloc. */ >> + if ( low_crashinfo_mode == LOW_CRASHINFO_NONE ) >> + note = xmalloc_bytes(nr_bytes); >> >> /* Protect the write into crash_notes[] with a spinlock, as this function >> * is on a hotplug path and a hypercall path. */ >> @@ -330,6 +419,11 @@ static int kexec_init_cpu_notes(const un >> } >> else >> { >> + /* If we care about memory possition, alloc from the crash heap, >> + * also protected by the crash_notes_lock. */ >> + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) >> + note = alloc_from_crash_heap(nr_bytes); >> + >> crash_notes[cpu].start = note; >> crash_notes[cpu].size = nr_bytes; >> spin_unlock(&crash_notes_lock); >> @@ -389,6 +483,18 @@ static struct notifier_block cpu_nfb = { >> .notifier_call = cpu_callback >> }; >> >> +void __init kexec_early_calculations(void) >> +{ >> + /* If low_crashinfo_mode is still INVALID, neither "low_crashinfo" nor >> + * "crashinfo_maxaddr" have been specified on the command line, so >> + * explicitly set to NONE. */ >> + if ( low_crashinfo_mode == LOW_CRASHINFO_INVALID ) >> + low_crashinfo_mode = LOW_CRASHINFO_NONE; >> + >> + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) >> + crashinfo_maxaddr_bits = fls(crashinfo_maxaddr) - 1; >> +} >> + >> static int __init kexec_init(void) >> { >> void *cpu = (void *)(unsigned long)smp_processor_id(); >> @@ -399,6 +505,29 @@ static int __init kexec_init(void) >> >> register_keyhandler(''C'', &crashdump_trigger_keyhandler); >> >> + if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) >> + { >> + size_t crash_heap_size; >> + >> + /* This calculation is safe even if the machine is booted in >> + * uniprocessor mode. */ >> + crash_heap_size = sizeof_cpu_notes(0) + >> + sizeof_cpu_notes(1) * (nr_cpu_ids - 1); >> + crash_heap_size = PAGE_ALIGN(crash_heap_size); >> + >> + crash_heap_current = alloc_xenheap_pages( >> + get_order_from_bytes(crash_heap_size), >> + MEMF_bits(crashinfo_maxaddr_bits) ); >> + >> + if ( ! crash_heap_current ) >> + return -ENOMEM; >> + >> + crash_heap_end = crash_heap_current + crash_heap_size; >> + } >> + >> + /* crash_notes may be allocated anywhere Xen can reach in memory. >> + Only the individual CPU crash notes themselves must be allocated >> + in lower memory if requested. */ >> crash_notes = xmalloc_array(crash_note_range_t, nr_cpu_ids); >> if ( ! crash_notes ) >> return -ENOMEM; >> diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/drivers/char/console.c >> --- a/xen/drivers/char/console.c >> +++ b/xen/drivers/char/console.c >> @@ -596,7 +596,7 @@ void __init console_init_postirq(void) >> opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh); >> >> order = get_order_from_bytes(max(opt_conring_size, conring_size)); >> - while ( (ring = alloc_xenheap_pages(order, 0)) == NULL ) >> + while ( (ring = alloc_xenheap_pages(order, >> MEMF_bits(crashinfo_maxaddr_bits))) == NULL ) >> { >> BUG_ON(order == 0); >> order--; >> diff -r 1e79fb200722 -r 3669bd0ac6b9 xen/include/xen/kexec.h >> --- a/xen/include/xen/kexec.h >> +++ b/xen/include/xen/kexec.h >> @@ -25,6 +25,19 @@ void set_kexec_crash_area_size(u64 syste >> #define KEXEC_IMAGE_CRASH_BASE 2 >> #define KEXEC_IMAGE_NR 4 >> >> +enum low_crashinfo { >> + LOW_CRASHINFO_INVALID = 0, >> + LOW_CRASHINFO_NONE = 1, >> + LOW_CRASHINFO_MIN = 2, >> + LOW_CRASHINFO_ALL = 3 >> +}; >> + >> +/* Low crashinfo mode. Start as INVALID so serveral codepaths can set up >> + * defaults without needing to know the state of the others. */ >> +extern enum low_crashinfo low_crashinfo_mode; >> +extern paddr_t crashinfo_maxaddr_bits; >> +void kexec_early_calculations(void); >> + >> int machine_kexec_load(int type, int slot, xen_kexec_image_t *image); >> void machine_kexec_unload(int type, int slot, xen_kexec_image_t *image); >> void machine_kexec_reserved(xen_kexec_reserve_t *reservation); >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
Keir Fraser
2012-Mar-09 16:30 UTC
Re: [PATCH 3 of 4] KEXEC: Allocate crash structures in low memory
On 09/03/2012 16:24, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> On 09/03/12 15:52, Keir Fraser wrote: >> On 09/03/2012 14:42, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: >> >>> On 64bit Xen with 32bit dom0 and crashkernel, xmalloc''ing items such >>> as the CPU crash notes will go into the xenheap, which tends to be in >>> upper memory. This causes problems on machines with more than 64GB >>> (or 4GB if no PAE support) of ram as the crashkernel physically cant >>> access the crash notes. >> Why wouldn''t you run a 64-bit crashkernel? >> >> -- Keir > > There is certainly an argument to be made along those lines, and I > personally lean to that side of the argument. However, as our dom0 > kernel is 32bits, there is no substantial testing of the drivers built > for 64bit.Yes, I suppose this sounds pretty reasonable. -- Keir
On 09/03/12 15:43, Jan Beulich wrote:>>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> It is very useful for the crashdump kernel to have an idea of what Xen >> thought was the maximum memory address was, especially if the >> crashdump kernel is 32bit. > But then you also need to update the value upon memory hotplug. > (And I assume subsequent patches will actually make use of this.) > > JanWhere/how does memory hotplug get exposed then? A cursory look in mm.c has references to hotplug in the init paths. A subsequent patch does make use of this value. ~Andrew>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 977fa1c61b3e -r 1e79fb200722 xen/arch/x86/mm.c >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -149,6 +149,7 @@ struct domain *dom_xen, *dom_io, *dom_co >> /* Frame table size in pages. */ >> unsigned long max_page; >> unsigned long total_pages; >> +u64 top_of_mem; >> >> unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS( >> (FRAMETABLE_SIZE / sizeof(*frame_table) + PDX_GROUP_COUNT - 1) >> diff -r 977fa1c61b3e -r 1e79fb200722 xen/arch/x86/setup.c >> --- a/xen/arch/x86/setup.c >> +++ b/xen/arch/x86/setup.c >> @@ -1111,6 +1111,7 @@ void __init __start_xen(unsigned long mb >> nr_pages >> (20 - PAGE_SHIFT), >> nr_pages << (PAGE_SHIFT - 10)); >> total_pages = nr_pages; >> + top_of_mem = e820.map[e820.nr_map-1].addr + e820.map[e820.nr_map-1].size; >> >> /* Sanity check for unwanted bloat of certain hypercall structures. */ >> BUILD_BUG_ON(sizeof(((struct xen_platform_op *)0)->u) !>> diff -r 977fa1c61b3e -r 1e79fb200722 xen/include/asm-x86/mm.h >> --- a/xen/include/asm-x86/mm.h >> +++ b/xen/include/asm-x86/mm.h >> @@ -296,6 +296,7 @@ int get_superpage(unsigned long mfn, str >> #endif >> extern unsigned long max_page; >> extern unsigned long total_pages; >> +extern u64 top_of_mem; >> void init_frametable(void); >> >> #define PDX_GROUP_COUNT ((1 << L2_PAGETABLE_SHIFT) / \ >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel > >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
>>> On 09.03.12 at 17:08, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > Doh! I started this patch against unstable, with xzalloc, then ported to > 4.1 without xzalloc, then ported back to unstable and forgot to change.And meanwhile 4.1 has xzalloc, btw. Jan
>>> On 09.03.12 at 17:46, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > On 09/03/12 15:43, Jan Beulich wrote: >>>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>> It is very useful for the crashdump kernel to have an idea of what Xen >>> thought was the maximum memory address was, especially if the >>> crashdump kernel is 32bit. >> But then you also need to update the value upon memory hotplug. >> (And I assume subsequent patches will actually make use of this.) > > Where/how does memory hotplug get exposed then? A cursory look in mm.c > has references to hotplug in the init paths.xen/arch/x86/x86_64/mm.c:memory_add() & Co. Jan
Andrew Cooper
2012-Mar-09 17:01 UTC
Re: [PATCH 3 of 4] KEXEC: Allocate crash structures in low memory
On 09/03/12 15:55, Jan Beulich wrote:>>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> --- a/xen/drivers/char/console.c >> +++ b/xen/drivers/char/console.c >> @@ -596,7 +596,7 @@ void __init console_init_postirq(void) >> opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh); >> >> order = get_order_from_bytes(max(opt_conring_size, conring_size)); >> - while ( (ring = alloc_xenheap_pages(order, 0)) == NULL ) >> + while ( (ring = alloc_xenheap_pages(order, MEMF_bits(crashinfo_maxaddr_bits))) == NULL ) >> { >> BUG_ON(order == 0); >> order--; > Did you note the loop here? Rather than shrinking the size, I think > you ought to ignore the address limit as the first fallback option, > in particular when the size was specified on the command line. > > JanYes, but I disagree. The default for crashinfo_maxaddr_bits is 64, so most of the time, this change will have no functional effect. The times when crashinfo_maxaddr_bits will change is either when the user specifies "low_crashinfo=min/all" explicitly on the command line, or implicitly specifies "low_crashinfo=min" by explicitly specifying "crashinfo_maxaddr=$foo" on the command line. At this point, I would say that the users request to have the console ring in low memory overrides the size argument. The usecase for low_crashinfo is with a 32bit crash kernel, at which point a smaller console ring in lower memory is preferable to a larger console ring that the crashdump kernel cant access. If order gets to 0, one solution might be to then drop the location restriction and try unrestricted from size again. However, the point at which you cant allocate a single page in lower memory is the point at which you have bigger problems to worry about. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
Andrew Cooper
2012-Mar-09 17:33 UTC
Re: [PATCH 4 of 4] KEXEC: Introduce new crashtables interface
On 09/03/12 16:11, Jan Beulich wrote:>>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> --- a/xen/common/kexec.c >> +++ b/xen/common/kexec.c >> @@ -82,6 +85,21 @@ paddr_t crashinfo_maxaddr_bits = 64; >> /* Pointers to keep track of the crash heap region. */ >> static void *crash_heap_current = NULL, *crash_heap_end = NULL; >> >> +char * StringTab = NULL; >> +size_t StringTabSize = 0; > static? Naming convention?Yes>> + >> +static val64tab_t ValTab[] > const? Naming convention? >> +{ >> + { XEN_VALTAB_TOP_OF_MEM, 0 }, >> + { XEN_VALTAB_CONRING_SIZE, 0 } >> +}; >> + >> +static sym64tab_t SymTab[] > const? Naming convention? > >> +{ >> + { XEN_SYMTAB_CONRING, 0 } >> +}; >> + >> + >> /* >> * Parse command lines in the format >> * >> @@ -262,12 +280,116 @@ void kexec_crash_save_cpu(void) >> elf_core_save_regs(&prstatus->pr_reg, xencore); >> } >> >> +/* Round val up to a machine word alignment */ >> +size_t round_up(size_t val) >> +{ >> + return ((val + (sizeof(unsigned long)-1)) & >> + ~((sizeof (unsigned long))-1)); >> +} >> + >> + >> +extern xen_commandline_t saved_cmdline; >> +static int setup_stringtable(void) >> +{ >> + char * ptr; >> + size_t sl = sizeof(unsigned long); >> + >> +#define STRING(x) #x >> +#define INT2STR(x) STRING(x) >> + >> +#define XEN_STR_VERSION INT2STR(XEN_VERSION) "." INT2STR(XEN_SUBVERSION) \ >> + XEN_EXTRAVERSION >> +#define XEN_STR_COMPILED_BY XEN_COMPILE_BY "@" XEN_COMPILE_HOST "." \ >> + XEN_COMPILE_DOMAIN >> + >> + size_t size >> + sl + round_up(sizeof(XEN_STR_VERSION)) + >> + sl + round_up(sizeof(XEN_CHANGESET)) + >> + sl + round_up(sizeof(XEN_COMPILE_DATE)) + >> + sl + round_up(sizeof(XEN_STR_COMPILED_BY)) + >> + sl + round_up(sizeof(XEN_COMPILER)) + >> + sl + round_up(strlen(saved_cmdline)+1); >> + >> + StringTab = xmalloc_bytes(size); >> + if ( ! StringTab ) >> + return -ENOMEM; >> + >> + StringTabSize = size; >> + memset(StringTab, 0, size); >> + >> + ptr = StringTab; >> + >> +#define WRITE_STRTAB_ENTRY_CONST(v, s) \ >> + *(unsigned long*)ptr = (v); ptr += sl; \ >> + memcpy(ptr, s, sizeof(s)); ptr += round_up(sizeof(s)) >> + >> +#define WRITE_STRTAB_ENTRY_VAR(v, s) \ >> + *(unsigned long*)ptr = (v); ptr += sl; \ >> + memcpy(ptr, s, strlen(s)); ptr += round_up(strlen(s)+1) >> + >> + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_VERSION, XEN_STR_VERSION); >> + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_CSET, XEN_CHANGESET); >> + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILE_DATE, XEN_COMPILE_DATE); >> + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILED_BY, XEN_STR_COMPILED_BY); >> + WRITE_STRTAB_ENTRY_CONST(XEN_STRINGTAB_COMPILER, XEN_COMPILER); >> + WRITE_STRTAB_ENTRY_VAR(XEN_STRINGTAB_CMDLINE, saved_cmdline); >> + >> +#undef WRITE_STRTAB_ENTRY_VAR >> +#undef WRITE_STRTAB_ENTRY_CONST >> +#undef XEN_STR_COMPILED_BY >> +#undef XEN_STR_VERSION >> +#undef INT2STR >> +#undef STRING >> + >> + return 0; >> +} >> + >> +static void write_val64tab(void) >> +{ >> + int i; >> + for ( i=0; i<ARRAY_SIZE(ValTab); ++i ) >> + { >> + switch( ValTab[i].id ) >> + { >> + case XEN_VALTAB_TOP_OF_MEM: >> + ValTab[i].val = (uint64_t)top_of_mem; >> + break; >> + case XEN_VALTAB_CONRING_SIZE: >> + ValTab[i].val = (uint64_t)conring_size; >> + break; >> + default: >> + printk(XENLOG_WARNING "Unrecognised ValTab ID 0x%016"PRIx64"\n", >> + ValTab[i].id); > Sorry, but this is a BUG() or an ASSERT(), but not a warning.ASSERT() it is then. Having a bug at this point stalling the crash path would just be silly, and the crash kernel does have to deal with the possibility that all of Xen memory is completely corrupted.>> + break; >> + } >> + } >> +} >> + >> +static void write_sym64tab(void) >> +{ >> + int i; >> + for ( i=0; i<ARRAY_SIZE(SymTab); ++i ) >> + { >> + switch( SymTab[i].id ) >> + { >> + case XEN_SYMTAB_CONRING: >> + SymTab[i].addr = (uint64_t)__pa(conring); >> + break; >> + default: >> + printk(XENLOG_WARNING "Unrecognised SymTab ID 0x%016"PRIx64"\n", >> + SymTab[i].id); > Same here. > >> + break; >> + } >> + } >> +} >> + >> /* Set up the single Xen-specific-info crash note. */ >> crash_xen_info_t *kexec_crash_save_info(void) >> { >> int cpu = smp_processor_id(); >> crash_xen_info_t info; >> crash_xen_info_t *out = (crash_xen_info_t *)ELFNOTE_DESC(xen_crash_note); >> + char * details; >> >> BUG_ON(!cpumask_test_and_set_cpu(cpu, &crash_saved_cpus)); >> >> @@ -284,6 +406,17 @@ crash_xen_info_t *kexec_crash_save_info( >> /* Copy from guaranteed-aligned local copy to possibly-unaligned dest. */ >> memcpy(out, &info, sizeof(info)); >> >> + details = ELFNOTE_DESC(xen_stringtab); >> + memcpy(details, StringTab, StringTabSize); >> + >> + details = ELFNOTE_DESC(xen_valtab); >> + write_val64tab(); >> + memcpy(details, ValTab, sizeof(ValTab)); >> + >> + details = ELFNOTE_DESC(xen_symtab); >> + write_sym64tab(); >> + memcpy(details, SymTab, sizeof(SymTab)); >> + >> return out; >> } >> >> @@ -365,7 +498,10 @@ static size_t sizeof_cpu_notes(const uns >> /* CPU0 also presents the crash_xen_info note. */ >> if ( ! cpu ) >> bytes = bytes + >> - sizeof_note("Xen", sizeof(crash_xen_info_t)); >> + sizeof_note("Xen", sizeof(crash_xen_info_t)) + >> + sizeof_note("Xen", StringTabSize) + >> + sizeof_note("Xen", sizeof(ValTab)) + >> + sizeof_note("Xen", sizeof(SymTab)); >> >> return bytes; >> } >> @@ -449,6 +585,21 @@ static int kexec_init_cpu_notes(const un >> xen_crash_note = note = ELFNOTE_NEXT(note); >> setup_note(note, "Xen", XEN_ELFNOTE_CRASH_INFO, >> sizeof(crash_xen_info_t)); >> + >> + /* Set up Xen StringTab note. */ >> + xen_stringtab = note = ELFNOTE_NEXT(note); >> + setup_note(note, "Xen", XEN_CRASHTABLE_STRINGTAB, >> + StringTabSize); >> + >> + /* Set up Xen ValTab note. */ >> + xen_valtab = note = ELFNOTE_NEXT(note); >> + setup_note(note, "Xen", XEN_CRASHTABLE_VAL64TAB, >> + sizeof(ValTab)); >> + >> + /* Set up Xen SymTab note. */ >> + xen_symtab = note = ELFNOTE_NEXT(note); >> + setup_note(note, "Xen", XEN_CRASHTABLE_SYM64TAB, >> + sizeof(SymTab)); >> } >> } >> } >> @@ -505,6 +656,9 @@ static int __init kexec_init(void) >> >> register_keyhandler(''C'', &crashdump_trigger_keyhandler); >> >> + if ( setup_stringtable() ) >> + return -ENOMEM; > Given that this is an extension, I don''t think failing to set this up > should be fatal in any way. Just don''t write those new notes when > you can''t set them up.Ok>> + >> if ( low_crashinfo_mode > LOW_CRASHINFO_NONE ) >> { >> size_t crash_heap_size; >> --- a/xen/drivers/char/console.c >> +++ b/xen/drivers/char/console.c >> @@ -59,8 +59,8 @@ size_param("conring_size", opt_conring_s >> #define _CONRING_SIZE 16384 >> #define CONRING_IDX_MASK(i) ((i)&(conring_size-1)) >> static char __initdata _conring[_CONRING_SIZE]; >> -static char *__read_mostly conring = _conring; >> -static uint32_t __read_mostly conring_size = _CONRING_SIZE; >> +char *__read_mostly conring = _conring; >> +uint32_t __read_mostly conring_size = _CONRING_SIZE; > Please don''t. If anything, add a call into this file to have the note > fields filled.Why not? Is this about the potential safety of making the symbols visible outside of console.o? Having functions around the codebase to call to fill in values is not scalable in terms of code; It would either require one function per intended entry into the crash table, or one function per .c file which then has to re-switch on the table id. Both of these options require that new entries for the crash table require new code in two files rather than just one.>> static uint32_t conringc, conringp; >> >> static int __read_mostly sercon_handle = -1; >> --- /dev/null >> +++ b/xen/include/public/crashtables.h >> @@ -0,0 +1,77 @@ >> +/****************************************************************************** >> + * crashtables.h >> + * >> + * Definitions used for the Xen ELF crash notes. >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a copy >> + * of this software and associated documentation files (the "Software"), to >> + * deal in the Software without restriction, including without limitation the >> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or >> + * sell copies of the Software, and to permit persons to whom the Software is >> + * furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE >> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >> + * DEALINGS IN THE SOFTWARE. >> + * >> + * Copyright (c) 2011,2012 Andrew Cooper, Citrix Ltd. >> + */ >> + >> +#ifndef __XEN_PUBLIC_CRASHTABLES_H__ >> +#define __XEN_PUBLIC_CRASHTABLES_H__ >> + >> +#include "xen.h" >> + >> +/* >> + * Xen string crash table. >> + * Note is a list in the form of: >> + * Machine word (32/64bit) String ID. >> + * Null terminating string, 0-extended to word alignment. >> + */ >> +#define XEN_CRASHTABLE_STRINGTAB 0x3000000 > The number looks as if it wasn''t picked arbitrarily, so a comment should > say what it was derived from. Or if it is truly (and compatibly) arbitrary, > imo this should be stated here too.Ok>> + >> +#define XEN_STRINGTAB_VERSION 0 >> +#define XEN_STRINGTAB_CSET 1 >> +#define XEN_STRINGTAB_COMPILE_DATE 2 >> +#define XEN_STRINGTAB_COMPILED_BY 3 >> +#define XEN_STRINGTAB_COMPILER 4 >> +#define XEN_STRINGTAB_CMDLINE 5 >> + >> +/* >> + * Xen value crash table. >> + * Note is a list in the form of: >> + * Machine word (32/64bit) Value ID. >> + * 64bit value. >> + */ >> +#define XEN_CRASHTABLE_VAL64TAB 0x3000001 >> + >> +typedef struct { unsigned long id; uint64_t val; } val64tab_t; > You''re saying you''re doing this to improve 32-/64-bit interaction. How > is using ''unsigned long'' here (and below) fitting into this picture?You are correct. I have missed the final logical step (I think when merging my much finer grain development patches).>> + >> +#define XEN_VALTAB_TOP_OF_MEM 0 >> +#define XEN_VALTAB_CONRING_SIZE 1 >> + >> +/* >> + * Xen symbol crash table >> + * Note is a list in the form of: >> + * Machine word (32/64bit) Symbol ID. >> + * Symbol Address (64bit pointers). >> + */ >> +#define XEN_CRASHTABLE_SYM64TAB 0x3000002 >> + >> +typedef struct { unsigned long id; uint64_t addr; } sym64tab_t; >> + >> +#define XEN_SYMTAB_CONRING 0 >> + >> + >> +#endif /* __XEN_PUBLIC_CRASHTABLES_H__ */ >> + >> +/* >> +TODO: reintroduce local vars here. emacs doesn''t get on well with patch files claiming to be C files >> + */ > ??? > > JanDitto re. merging, although I realized this mistake right after the emails got sent. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Mar-11 09:59 UTC
Re: [PATCH 4 of 4] KEXEC: Introduce new crashtables interface
>>> Andrew Cooper <andrew.cooper3@citrix.com> 03/09/12 6:34 PM >>>On 09/03/12 16:11, Jan Beulich wrote:>>>> On 09.03.12 at 15:42, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >>> --- a/xen/drivers/char/console.c >>> +++ b/xen/drivers/char/console.c >>> @@ -59,8 +59,8 @@ size_param("conring_size", opt_conring_s >>> #define _CONRING_SIZE 16384 >>> #define CONRING_IDX_MASK(i) ((i)&(conring_size-1)) >>> static char __initdata _conring[_CONRING_SIZE]; >>> -static char *__read_mostly conring = _conring; >>> -static uint32_t __read_mostly conring_size = _CONRING_SIZE; >>> +char *__read_mostly conring = _conring; >>> +uint32_t __read_mostly conring_size = _CONRING_SIZE; >> Please don''t. If anything, add a call into this file to have the note >> fields filled. > >Why not? Is this about the potential safety of making the symbols >visible outside of console.o?Yes.>Having functions around the codebase to call to fill in values is not >scalable in terms of code; It would either require one function per >intended entry into the crash table, or one function per .c file which >then has to re-switch on the table id.There shouldn''t be many. In particular I don''t buy that this method is significantly better than having the analysis tool use the symbol table. Jan