Keir Fraser
2011-Nov-30 09:05 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
On 30/11/2011 16:27, "Jan Beulich" <JBeulich@suse.com> wrote:

> In order to not convert the spin_lock() in microcode_update_cpu() (and
> then obviously also all other uses on microcode_mutex) to
> spin_lock_irqsave() (which would be undesirable for the hypercall
> context in which the function also runs), the boot time handling gets
> done using a tasklet (instead of using on_selected_cpus()).

Can you explain this some more? Why would the conversion to
spin_lock_irqsave be required when spin_lock is sufficient for current usage
from dom0 hypercall?

 -- Keir
Jan Beulich
2011-Nov-30 16:27 UTC
[PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
Largely as a result of the continuing resistance of Linux maintainers
to accept a microcode loading patch for pv-ops Xen kernels, this
follows the suggested route and provides a means to load microcode
updates without the assistance of Dom0, thus also addressing eventual
problems in the hardware much earlier.

This leverages the fact that via the multiboot protocol another blob
of data can be easily added in the form of just an extra module. Since
microcode data cannot reliably be recognized by looking at the
provided data, this requires (in the non-EFI case) the use of a
command line parameter ("ucode=<number>") to identify which of the
modules is to be parsed for an eventual microcode update (in the EFI
case the module is being identified in the config file, and hence the
command line argument, if given, will be ignored).

This required to adjust the XSM module determination logic accordingly.

The format of the data to be provided is the raw binary blob already
used for AMD CPUs, and the output of the intel-microcode2ucode utility
for the Intel case (either the per-(family,model,stepping) file or -
to make things easier for distro-s integration-wise - simply the
concatenation of all of them).

In order to not convert the spin_lock() in microcode_update_cpu() (and
then obviously also all other uses on microcode_mutex) to
spin_lock_irqsave() (which would be undesirable for the hypercall
context in which the function also runs), the boot time handling gets
done using a tasklet (instead of using on_selected_cpus()).
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/efi/boot.c
+++ b/xen/arch/x86/efi/boot.c
@@ -49,6 +49,7 @@ static UINT32 __initdata mdesc_ver;
 static struct file __initdata cfg;
 static struct file __initdata kernel;
 static struct file __initdata ramdisk;
+static struct file __initdata ucode;
 static struct file __initdata xsm;
 
 static multiboot_info_t __initdata mbi = {
@@ -174,6 +175,8 @@ static void __init __attribute__((__nore
         efi_bs->FreePages(kernel.addr, PFN_UP(kernel.size));
     if ( ramdisk.addr )
         efi_bs->FreePages(ramdisk.addr, PFN_UP(ramdisk.size));
+    if ( ucode.addr )
+        efi_bs->FreePages(ucode.addr, PFN_UP(ucode.size));
     if ( xsm.addr )
         efi_bs->FreePages(xsm.addr, PFN_UP(xsm.size));
 
@@ -806,6 +809,17 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SY
         efi_bs->FreePool(name.w);
     }
 
+    name.s = get_value(&cfg, section.s, "ucode");
+    if ( !name.s )
+        name.s = get_value(&cfg, "global", "ucode");
+    if ( name.s )
+    {
+        microcode_set_module(mbi.mods_count);
+        split_value(name.s);
+        read_file(dir_handle, s2w(&name), &ucode);
+        efi_bs->FreePool(name.w);
+    }
+
     name.s = get_value(&cfg, section.s, "xsm");
     if ( name.s )
     {
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -29,13 +29,49 @@
 #include <xen/notifier.h>
 #include <xen/sched.h>
 #include <xen/smp.h>
+#include <xen/softirq.h>
 #include <xen/spinlock.h>
+#include <xen/tasklet.h>
 #include <xen/guest_access.h>
 
 #include <asm/msr.h>
 #include <asm/processor.h>
+#include <asm/setup.h>
 #include <asm/microcode.h>
 
+static module_t __initdata ucode_mod;
+static void *(*__initdata ucode_mod_map)(const module_t *);
+static unsigned int __initdata ucode_mod_idx;
+static bool_t __initdata ucode_mod_forced;
+static cpumask_t __initdata init_mask;
+
+void __init microcode_set_module(unsigned int idx)
+{
+    ucode_mod_idx = idx;
+    ucode_mod_forced = 1;
+}
+
+static void __init parse_ucode(char *s)
+{
+    if ( !ucode_mod_forced )
+        ucode_mod_idx = simple_strtoul(s, NULL, 0);
+}
+custom_param("ucode", parse_ucode);
+
+void __init microcode_grab_module(
+    unsigned long *module_map,
+    const multiboot_info_t *mbi,
+    void *(*map)(const module_t *))
+{
+    module_t *mod = (module_t *)__va(mbi->mods_addr);
+
+    if ( !ucode_mod_idx || ucode_mod_idx >= mbi->mods_count ||
+         !__test_and_clear_bit(ucode_mod_idx, module_map) )
+        return;
+    ucode_mod = mod[ucode_mod_idx];
+    ucode_mod_map = map;
+}
+
 const struct microcode_ops *microcode_ops;
 
 static DEFINE_SPINLOCK(microcode_mutex);
@@ -183,6 +219,41 @@ int microcode_update(XEN_GUEST_HANDLE(co
     return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
 }
 
+static void __init _do_microcode_update(unsigned long data)
+{
+    microcode_update_cpu((void *)data, ucode_mod.mod_end);
+    cpumask_set_cpu(smp_processor_id(), &init_mask);
+}
+
+static int __init microcode_init(void)
+{
+    void *data;
+    static struct tasklet __initdata tasklet;
+    unsigned int cpu;
+
+    if ( !microcode_ops || !ucode_mod.mod_end )
+        return 0;
+
+    data = ucode_mod_map(&ucode_mod);
+    if ( !data )
+        return -ENOMEM;
+
+    softirq_tasklet_init(&tasklet, _do_microcode_update, (unsigned long)data);
+
+    for_each_online_cpu ( cpu )
+    {
+        tasklet_schedule_on_cpu(&tasklet, cpu);
+        do {
+            process_pending_softirqs();
+        } while ( !cpumask_test_cpu(cpu, &init_mask) );
+    }
+
+    ucode_mod_map(NULL);
+
+    return 0;
+}
+__initcall(microcode_init);
+
 static int microcode_percpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -205,7 +276,20 @@ static struct notifier_block microcode_p
 static int __init microcode_presmp_init(void)
 {
     if ( microcode_ops )
+    {
+        if ( ucode_mod.mod_end )
+        {
+            void *data = ucode_mod_map(&ucode_mod);
+
+            if ( data )
+                microcode_update_cpu(data, ucode_mod.mod_end);
+
+            ucode_mod_map(NULL);
+        }
+
         register_cpu_notifier(&microcode_percpu_nfb);
+    }
+
     return 0;
 }
 presmp_initcall(microcode_presmp_init);
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -550,10 +550,10 @@ void __init __start_xen(unsigned long mb
 {
     char *memmap_type = NULL;
     char *cmdline, *kextra, *loader;
-    unsigned int initrdidx = 1;
+    unsigned int initrdidx;
     multiboot_info_t *mbi = __va(mbi_p);
     module_t *mod = (module_t *)__va(mbi->mods_addr);
-    unsigned long nr_pages, modules_headroom;
+    unsigned long nr_pages, modules_headroom, *module_map;
     int i, j, e820_warn = 0, bytes = 0;
     bool_t acpi_boot_table_init_done = 0;
     struct ns16550_defaults ns16550 = {
@@ -1229,7 +1229,13 @@ void __init __start_xen(unsigned long mb
 
     init_IRQ();
 
-    xsm_init(&initrdidx, mbi, bootstrap_map);
+    module_map = xmalloc_array(unsigned long, BITS_TO_LONGS(mbi->mods_count));
+    bitmap_fill(module_map, mbi->mods_count);
+    __clear_bit(0, module_map); /* Dom0 kernel is always first */
+
+    xsm_init(module_map, mbi, bootstrap_map);
+
+    microcode_grab_module(module_map, mbi, bootstrap_map);
 
     timer_init();
 
@@ -1356,6 +1362,12 @@ void __init __start_xen(unsigned long mb
     if ( xen_cpuidle )
         xen_processor_pmbits |= XEN_PROCESSOR_PM_CX;
 
+    initrdidx = find_first_bit(module_map, mbi->mods_count);
+    if ( bitmap_weight(module_map, mbi->mods_count) > 1 )
+        printk(XENLOG_WARNING
+               "Multiple initrd candidates, picking module #%u\n",
+               initrdidx);
+
     /*
      * We're going to setup domain0 using the module(s) that we stashed safely
      * above our heap. The second module, if present, is an initrd ramdisk.
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -598,6 +598,7 @@ int cpuid_hypervisor_leaves( uint32_t id
 int rdmsr_hypervisor_regs(uint32_t idx, uint64_t *val);
 int wrmsr_hypervisor_regs(uint32_t idx, uint64_t val);
 
+void microcode_set_module(unsigned int);
 int microcode_update(XEN_GUEST_HANDLE(const_void), unsigned long len);
 int microcode_resume_cpu(int cpu);
 
--- a/xen/include/asm-x86/setup.h
+++ b/xen/include/asm-x86/setup.h
@@ -44,4 +44,7 @@ void discard_initial_images(void);
 int xen_in_range(unsigned long mfn);
 void arch_get_xen_caps(xen_capabilities_info_t *info);
 
+void microcode_grab_module(
+    unsigned long *, const multiboot_info_t *, void *(*)(const module_t *));
+
 #endif
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -454,14 +454,15 @@ static inline long __do_xsm_op (XEN_GUES
 }
 
 #ifdef XSM_ENABLE
-extern int xsm_init(unsigned int *initrdidx, const multiboot_info_t *mbi,
+extern int xsm_init(unsigned long *module_map, const multiboot_info_t *mbi,
                     void *(*bootstrap_map)(const module_t *));
-extern int xsm_policy_init(unsigned int *initrdidx, const multiboot_info_t *mbi,
+extern int xsm_policy_init(unsigned long *module_map,
+                           const multiboot_info_t *mbi,
                            void *(*bootstrap_map)(const module_t *));
 extern int register_xsm(struct xsm_operations *ops);
 extern int unregister_xsm(struct xsm_operations *ops);
 #else
-static inline int xsm_init (unsigned int *initrdidx,
+static inline int xsm_init (unsigned long *module_map,
                             const multiboot_info_t *mbi,
                             void *(*bootstrap_map)(const module_t *))
 {
--- a/xen/xsm/xsm_core.c
+++ b/xen/xsm/xsm_core.c
@@ -43,7 +43,7 @@ static void __init do_xsm_initcalls(void
     }
 }
 
-int __init xsm_init(unsigned int *initrdidx, const multiboot_info_t *mbi,
+int __init xsm_init(unsigned long *module_map, const multiboot_info_t *mbi,
                     void *(*bootstrap_map)(const module_t *))
 {
     int ret = 0;
@@ -52,7 +52,7 @@ int __init xsm_init(unsigned int *initrd
 
     if ( XSM_MAGIC )
     {
-        ret = xsm_policy_init(initrdidx, mbi, bootstrap_map);
+        ret = xsm_policy_init(module_map, mbi, bootstrap_map);
         if ( ret )
         {
             bootstrap_map(NULL);
--- a/xen/xsm/xsm_policy.c
+++ b/xen/xsm/xsm_policy.c
@@ -20,11 +20,12 @@
 
 #include <xsm/xsm.h>
 #include <xen/multiboot.h>
+#include <asm/bitops.h>
 
 char *__initdata policy_buffer = NULL;
 u32 __initdata policy_size = 0;
 
-int xsm_policy_init(unsigned int *initrdidx, const multiboot_info_t *mbi,
+int xsm_policy_init(unsigned long *module_map, const multiboot_info_t *mbi,
                     void *(*bootstrap_map)(const module_t *))
 {
     int i;
@@ -35,10 +36,13 @@ int xsm_policy_init(unsigned int *initrd
 
     /*
      * Try all modules and see whichever could be the binary policy.
-     * Adjust the initrdidx if module[1] is the binary policy.
+     * Adjust module_map for the module that is the binary policy.
      */
     for ( i = mbi->mods_count-1; i >= 1; i-- )
     {
+        if ( !test_bit(i, module_map) )
+            continue;
+
         _policy_start = bootstrap_map(mod + i);
         _policy_len = mod[i].mod_end;
 
@@ -50,8 +54,7 @@ int xsm_policy_init(unsigned int *initrd
             printk("Policy len 0x%lx, start at %p.\n",
                    _policy_len,_policy_start);
 
-            if ( i == 1 )
-                *initrdidx = (mbi->mods_count > 2) ? 2 : 0;
+            __clear_bit(i, module_map);
             break;
 
         }

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
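
The module bookkeeping the patch describes can be looked at in isolation. What follows is a minimal standalone C model, not the Xen code: module_map_t, module_map_init(), claim_module() and first_unclaimed() are hypothetical names, and a single 64-bit word stands in for the bitmap that the patch allocates with xmalloc_array(). It mirrors the checks in microcode_grab_module(): index 0 is rejected (module 0 is the Dom0 kernel, and ucode_mod_idx stays 0 when no "ucode=" was given), out-of-range indexes are rejected, and a module that was already claimed (for example by the XSM policy scan) cannot be claimed a second time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the module bitmap; models at most 64 modules. */
typedef uint64_t module_map_t;

/* Mark every module as unclaimed, except #0, which is the Dom0 kernel. */
static module_map_t module_map_init(unsigned int mods_count)
{
    module_map_t map = (mods_count >= 64) ? ~0ULL : ((1ULL << mods_count) - 1);

    return map & ~1ULL;
}

/*
 * Claim module idx for one consumer (XSM policy, microcode, ...).
 * Fails for index 0, for an out-of-range index, and for a module
 * that some other consumer has already claimed.
 */
static bool claim_module(module_map_t *map, unsigned int idx,
                         unsigned int mods_count)
{
    if ( idx == 0 || idx >= mods_count || idx >= 64 ||
         !(*map & (1ULL << idx)) )
        return false;

    *map &= ~(1ULL << idx);
    return true;
}

/* Lowest unclaimed module index, or mods_count if nothing is left. */
static unsigned int first_unclaimed(module_map_t map, unsigned int mods_count)
{
    for ( unsigned int i = 0; i < mods_count && i < 64; i++ )
        if ( map & (1ULL << i) )
            return i;

    return mods_count;
}
```

With three modules and "ucode=2", module 2 is claimed for microcode and module 1 is the only candidate left, which is the one the find_first_bit() call in __start_xen() would then treat as the initrd.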
Konrad Rzeszutek Wilk
2011-Nov-30 22:23 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
On Wed, Nov 30, 2011 at 04:27:11PM +0000, Jan Beulich wrote:
> Largely as a result of the continuing resistance of Linux maintainers
> to accept a microcode loading patch for pv-ops Xen kernels, this
> follows the suggested route and provides a means to load microcode
> updates without the assistance of Dom0, thus also addressing eventual
> problems in the hardware much earlier.
>
> This leverages the fact that via the multiboot protocol another blob
> of data can be easily added in the form of just an extra module. Since
> microcode data cannot reliably be recognized by looking at the
> provided data, this requires (in the non-EFI case) the use of a
> command line parameter ("ucode=<number>") to identify which of the

Well, usually there would be two modules - the kernel (which we can
identify) and the initramfs (which I guess one can also identify)?
It seems that by process of elimination we could determine that the
remaining module is the blob? Or would that be simply too dangerous
to make such assumption?

> modules is to be parsed for an eventual microcode update (in the EFI
> case the module is being identified in the config file, and hence the
> command line argument, if given, will be ignored).
>
> This required to adjust the XSM module determination logic accordingly.
>
> The format of the data to be provided is the raw binary blob already
> used for AMD CPUs, and the output of the intel-microcode2ucode utility
> for the Intel case (either the per-(family,model,stepping) file or -
> to make things easier for distro-s integration-wise - simply the
> concatenation of all of them).

There was some talk by hpa and borislav of how they wanted the payload,
but it never got finalized I think? Would it make sense to CC them on this
to see how they are planning to implement it in GRUB2? I got the impression
they wanted some new .pack format or so?

Or is the format that they were talking about exactly what you picked?

> In order to not convert the spin_lock() in microcode_update_cpu() (and
> then obviously also all other uses on microcode_mutex) to
> spin_lock_irqsave() (which would be undesirable for the hypercall
> context in which the function also runs), the boot time handling gets
> done using a tasklet (instead of using on_selected_cpus()).

Thanks for writing this.

> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Keir Fraser
2011-Dec-01 05:25 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
On 01/12/2011 08:00, "Jan Beulich" <JBeulich@suse.com> wrote:

>>>> On 30.11.11 at 10:05, Keir Fraser <keir.xen@gmail.com> wrote:
>> On 30/11/2011 16:27, "Jan Beulich" <JBeulich@suse.com> wrote:
>>
>>>
>>> In order to not convert the spin_lock() in microcode_update_cpu() (and
>>> then obviously also all other uses on microcode_mutex) to
>>> spin_lock_irqsave() (which would be undesirable for the hypercall
>>> context in which the function also runs), the boot time handling gets
>>> done using a tasklet (instead of using on_selected_cpus()).
>>
>> Can you explain this some more? Why would the conversion to
>> spin_lock_irqsave be required when spin_lock is sufficient for current usage
>> from dom0 hypercall?
>
> Because check_lock() wants locks to be acquired consistently?

Oh I see, and we use continue_hypercall_on_cpu() in the hypercall case.
Makes sense then.

 -- Keir

> Jan
>
Jan Beulich
2011-Dec-01 08:00 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
>>> On 30.11.11 at 10:05, Keir Fraser <keir.xen@gmail.com> wrote:
> On 30/11/2011 16:27, "Jan Beulich" <JBeulich@suse.com> wrote:
>
>>
>> In order to not convert the spin_lock() in microcode_update_cpu() (and
>> then obviously also all other uses on microcode_mutex) to
>> spin_lock_irqsave() (which would be undesirable for the hypercall
>> context in which the function also runs), the boot time handling gets
>> done using a tasklet (instead of using on_selected_cpus()).
>
> Can you explain this some more? Why would the conversion to
> spin_lock_irqsave be required when spin_lock is sufficient for current usage
> from dom0 hypercall?

Because check_lock() wants locks to be acquired consistently?

Jan
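The consistency rule referred to here can be modelled in a few lines. This is a user-space sketch only, not Xen's check_lock(): it assumes the rule is "a given lock must always be taken with interrupts enabled, or always with them disabled", returns a flag where the hypervisor would presumably assert, and assumes that on_selected_cpus() callbacks run with interrupts disabled while hypercall and tasklet context run with them enabled. The names irqs_enabled, enum ctx and take_lock_in() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt flag, standing in for the CPU's real IRQ state. */
static bool irqs_enabled = true;

struct lock_debug {
    int irq_safe;   /* -1: never taken, 0: taken with IRQs on, 1: with IRQs off */
};

#define LOCK_DEBUG_INIT { -1 }

/*
 * Model of the check: the first acquisition records the IRQ state,
 * every later acquisition must match it. Returns false on a mismatch.
 */
static bool check_lock(struct lock_debug *debug)
{
    int irq_safe = !irqs_enabled;

    assert(debug);

    if ( debug->irq_safe == -1 )
    {
        debug->irq_safe = irq_safe;
        return true;
    }

    return debug->irq_safe == irq_safe;
}

/* Hypothetical contexts from which microcode_mutex might be taken. */
enum ctx { CTX_HYPERCALL, CTX_IPI_CALLBACK, CTX_TASKLET };

static bool take_lock_in(enum ctx c, struct lock_debug *debug)
{
    /* Assumption: only the IPI callback runs with interrupts disabled. */
    irqs_enabled = (c != CTX_IPI_CALLBACK);
    return check_lock(debug);
}
```

Under these assumptions a plain spin_lock() taken first from hypercall context and later from an on_selected_cpus() callback trips the check, while taking it from a tasklet does not, which is why the boot-time path can share the lock without converting every user to spin_lock_irqsave().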
Jan Beulich
2011-Dec-01 08:04 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
>>> On 30.11.11 at 23:23, Konrad Rzeszutek Wilk <konrad@darnok.org> wrote:
> On Wed, Nov 30, 2011 at 04:27:11PM +0000, Jan Beulich wrote:
>> Largely as a result of the continuing resistance of Linux maintainers
>> to accept a microcode loading patch for pv-ops Xen kernels, this
>> follows the suggested route and provides a means to load microcode
>> updates without the assistance of Dom0, thus also addressing eventual
>> problems in the hardware much earlier.
>>
>> This leverages the fact that via the multiboot protocol another blob
>> of data can be easily added in the form of just an extra module. Since
>> microcode data cannot reliably be recognized by looking at the
>> provided data, this requires (in the non-EFI case) the use of a
>> command line parameter ("ucode=<number>") to identify which of the
>
> Well, usually there would be two modules - the kernel (which we can
> identify) and the initramfs (which I guess one can also identify)?
> It seems that by process of elimination we could determine that the
> remaining module is the blob? Or would that be simple too dangerous
> to make such assumption?

For one, we must not imply that what Linux calls "initrd" isn't used
as something completely different on other possible Dom0 OSes. Hence we
can't make assumptions on the format of this module.

And second, yes, I consider it too dangerous to guess on what might be
the microcode blob.

>> modules is to be parsed for an eventual microcode update (in the EFI
>> case the module is being identified in the config file, and hence the
>> command line argument, if given, will be ignored).
>>
>> This required to adjust the XSM module determination logic accordingly.
>>
>> The format of the data to be provided is the raw binary blob already
>> used for AMD CPUs, and the output of the intel-microcode2ucode utility
>> for the Intel case (either the per-(family,model,stepping) file or -
>> to make things easier for distro-s integration-wise - simply the
>> concatenation of all of them).
>
> There was some talk by hpa and borislav of how they wanted the payload,
> but it never got finalized I think? Would it make sense to CC them on
> this to see how they are planning to implement it in GRUB2?

Adding them to Cc now.

> I got the impression they wanted some new .pack format or so?
> Or is the format that they were talking about exactly what you picked?

I merely picked the binary formats that are already in use; I see no
reason to invent another one.

Jan
Tim Deegan
2011-Dec-01 09:48 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
At 18:23 -0400 on 30 Nov (1322677439), Konrad Rzeszutek Wilk wrote:
> On Wed, Nov 30, 2011 at 04:27:11PM +0000, Jan Beulich wrote:
> > Largely as a result of the continuing resistance of Linux maintainers
> > to accept a microcode loading patch for pv-ops Xen kernels, this
> > follows the suggested route and provides a means to load microcode
> > updates without the assistance of Dom0, thus also addressing eventual
> > problems in the hardware much earlier.
> >
> > This leverages the fact that via the multiboot protocol another blob
> > of data can be easily added in the form of just an extra module. Since
> > microcode data cannot reliably be recognized by looking at the
> > provided data, this requires (in the non-EFI case) the use of a
> > command line parameter ("ucode=<number>") to identify which of the
>
> Well, usually there would be two modules - the kernel (which we can
> identify) and the initramfs (which I guess one can also identify)?
> It seems that by process of elimination we could determine that the
> remaining module is the blob?

I'd really rather not do it that way - what if you want no initrd but do
have a microcode update? :) And what if we want to add another
special-purpose multiboot module later?

It might be nice if that number could take negative values, though -
i.e. ucode=-1 to say 'use the last module in the list'. Then the user
could adjust the dom0 modules without needing to recalculate.

Tim.
Jan Beulich
2011-Dec-01 09:54 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
>>> On 01.12.11 at 10:48, Tim Deegan <tim@xen.org> wrote:
> It might be nice if that number could take negative values, though -
> i.e. ucode=-1 to say 'use the last module in the list'. Then the user
> could adjust the dom0 modules without needing to recalculate.

That's certainly a nice enhancement; I'll put this on my to-do list.

Jan
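As a rough sketch of what such an enhancement might look like (this is not code from the patch, and resolve_ucode_index() is a hypothetical name): a negative value is treated as counting back from the end of the module list, and anything that ends up at module 0 (assumed here to be the Dom0 kernel) or outside the list is rejected.

```c
#include <assert.h>
#include <limits.h>

/*
 * Resolve a user-supplied "ucode=<number>" value against the module count.
 * Negative values count from the end, so -1 names the last module.
 * Returns the module index, or -1 if no usable module is named.
 */
static int resolve_ucode_index(long requested, unsigned int mods_count)
{
    long idx = requested;

    /* Keeps the final cast to int safe. */
    assert(mods_count <= INT_MAX);

    if ( idx < 0 )
        idx += (long)mods_count;

    /* Module 0 is assumed to be the Dom0 kernel and is never a candidate. */
    if ( idx <= 0 || idx >= (long)mods_count )
        return -1;

    return (int)idx;
}
```

With three modules, both ucode=2 and ucode=-1 select module 2. If a further Dom0 module is inserted ahead of the microcode blob, ucode=-1 keeps working while ucode=2 would need to be changed by hand.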
Ian Campbell
2011-Dec-01 09:55 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
On Thu, 2011-12-01 at 08:04 +0000, Jan Beulich wrote:
>
> > I got the impression they wanted some new .pack format or so?
> > Or is the format that they were talking about exactly what you picked?
>
> I merely picked the binary formats that are already in use; I see no
> reason to invent another one.

Adding a header/magic number so you can detect which of the blobs is
microcode?

Ian.
Jan Beulich
2011-Dec-01 10:02 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
>>> On 01.12.11 at 10:55, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Thu, 2011-12-01 at 08:04 +0000, Jan Beulich wrote:
>>
>> > I got the impression they wanted some new .pack format or so?
>> > Or is the format that they were talking about exactly what you picked?
>>
>> I merely picked the binary formats that are already in use; I see no
>> reason to invent another one.
>
> Adding a header/magic number so you can detect which of the blobs is
> microcode?

This is precisely what I did *not* want to do.

Jan
Borislav Petkov
2011-Dec-01 10:38 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
On Thu, Dec 01, 2011 at 10:02:10AM +0000, Jan Beulich wrote:
> >>> On 01.12.11 at 10:55, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Thu, 2011-12-01 at 08:04 +0000, Jan Beulich wrote:
> >>
> >> > I got the impression they wanted some new .pack format or so?
> >> > Or is the format that they were talking about exactly what you picked?
> >>
> >> I merely picked the binary formats that are already in use; I see no
> >> reason to invent another one.
> >
> > Adding a header/magic number so you can detect which of the blobs is
> > microcode?
>
> This is precisely what I did *not* want to do.

Well, AFAICR, we talked about using the setup_data linked list in
the user-mode kernel header and since parse_setup_data() looks at
data->type, then probably something should state the type of the ucode
image, but I'm not sure on the details.

FWIW, hpa mentioned at KS he already has some code doing early ucode
loading so I'll let him chime in here. He's away this week though so it
could take a while.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632
WEEE Registernr: 129 19551
Konrad Rzeszutek Wilk
2011-Dec-13 16:42 UTC
Re: [PATCH 2/2] x86/microcode: enable boot time (pre-Dom0) loading
On Thu, Dec 01, 2011 at 11:38:08AM +0100, Borislav Petkov wrote:
> On Thu, Dec 01, 2011 at 10:02:10AM +0000, Jan Beulich wrote:
> > >>> On 01.12.11 at 10:55, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > > On Thu, 2011-12-01 at 08:04 +0000, Jan Beulich wrote:
> > >>
> > >> > I got the impression they wanted some new .pack format or so?
> > >> > Or is the format that they were talking about exactly what you picked?
> > >>
> > >> I merely picked the binary formats that are already in use; I see no
> > >> reason to invent another one.
> > >
> > > Adding a header/magic number so you can detect which of the blobs is
> > > microcode?
> >
> > This is precisely what I did *not* want to do.
>
> Well, AFAICR, we talked about using the setup_data linked list in
> the user-mode kernel header and since parse_setup_data() looks at
> data->type, then probably something should state the type of the ucode
> image, but I'm not sure on the details.
>
> FWIW, hpa mentioned at KS he already has some code doing early ucode
> loading so I'll let him chime in here. He's away this week though so it
> could take a while.

ping?