Juergen Gross
2017-May-19 15:47 UTC
[PATCH 00/10] paravirt: make amount of paravirtualization configurable
Today paravirtualization is an all-or-nothing game: either a kernel is compiled with no paravirtualization support at all, or it supports fully paravirtualized environments like Xen pv-guests or lguest in addition to the paravirtualized tuning for KVM, Hyper-V, VMware or Xen HVM-guests. As support of pv-guests requires quite intrusive pv-hooks (e.g. for all access functions to page table entries and for privileged instructions), it is desirable to enable those hooks only when support of pv-guests is really wanted. With Xen guest support now split into pv-guest support and HVM-guest support, the same can be done for paravirtualization itself: full paravirtualization is required only if XEN_PV or LGUEST_GUEST is configured.

This patch series carves pv-guest support out of PARAVIRT by introducing a new PARAVIRT_FULL config option, selected by the XEN_PV and LGUEST_GUEST config options.

The series has been tested with 32- and 64-bit kernels without PARAVIRT, with PARAVIRT, and with PARAVIRT + PARAVIRT_FULL configured.

Juergen Gross (10):
  x86: remove stale prototype from arch/x86/include/asm/pgalloc.h
  paravirt: remove unused function paravirt_disable_iospace()
  xen: move interrupt handling for pv guests under CONFIG_XEN_PV umbrella
  xen: remove non-pv test from arch/x86/xen/irq.c
  paravirt: add new PARAVIRT_FULL config item
  paravirt: split pv_cpu_ops for support of PARAVIRT_FULL
  paravirt: split pv_irq_ops for support of PARAVIRT_FULL
  paravirt: split pv_mmu_ops for support of PARAVIRT_FULL
  paravirt: split pv_info for support of PARAVIRT_FULL
  paravirt: merge pv_ops_* structures into one

 MAINTAINERS                                 |   2 +-
 arch/x86/Kconfig                            |   4 +
 arch/x86/boot/compressed/misc.h             |   1 +
 arch/x86/entry/entry_32.S                   |   4 +-
 arch/x86/entry/entry_64.S                   |  10 +-
 arch/x86/include/asm/debugreg.h             |   2 +-
 arch/x86/include/asm/desc.h                 |   4 +-
 arch/x86/include/asm/fixmap.h               |   2 +-
 arch/x86/include/asm/irqflags.h             |  40 +-
 arch/x86/include/asm/mmu_context.h          |   4 +-
 arch/x86/include/asm/msr.h                  |   4 +-
 arch/x86/include/asm/paravirt.h             | 738 ++--------------------------
 arch/x86/include/asm/paravirt_full.h        | 714 +++++++++++++++++++++++++++
 arch/x86/include/asm/paravirt_types.h       | 243 +---------
 arch/x86/include/asm/paravirt_types_full.h  | 218 ++++++++
 arch/x86/include/asm/pgalloc.h              |   4 +-
 arch/x86/include/asm/pgtable-3level_types.h |   4 +-
 arch/x86/include/asm/pgtable.h              |   8 +-
 arch/x86/include/asm/processor.h            |   4 +-
 arch/x86/include/asm/ptrace.h               |   5 +-
 arch/x86/include/asm/segment.h              |   2 +-
 arch/x86/include/asm/special_insns.h        |  25 +-
 arch/x86/include/asm/tlbflush.h             |   2 +-
 arch/x86/kernel/Makefile                    |   1 +
 arch/x86/kernel/alternative.c               |   4 +-
 arch/x86/kernel/asm-offsets.c               |  21 +-
 arch/x86/kernel/asm-offsets_64.c            |   9 +-
 arch/x86/kernel/cpu/common.c                |   4 +-
 arch/x86/kernel/cpu/vmware.c                |   6 +-
 arch/x86/kernel/head_64.S                   |   2 +-
 arch/x86/kernel/kvm.c                       |   6 +-
 arch/x86/kernel/kvmclock.c                  |   6 +-
 arch/x86/kernel/paravirt.c                  | 303 +-----------
 arch/x86/kernel/paravirt_full.c             | 277 +++++++++++
 arch/x86/kernel/paravirt_patch_32.c         |  36 +-
 arch/x86/kernel/paravirt_patch_64.c         |  50 +-
 arch/x86/kernel/tsc.c                       |   2 +-
 arch/x86/kernel/vsmp_64.c                   |  18 +-
 arch/x86/lguest/Kconfig                     |   1 +
 arch/x86/lguest/boot.c                      | 100 ++--
 arch/x86/xen/Kconfig                        |   1 +
 arch/x86/xen/Makefile                       |   8 +-
 arch/x86/xen/enlighten_hvm.c                |   4 +-
 arch/x86/xen/enlighten_pv.c                 |  58 ++-
 arch/x86/xen/irq.c                          |  15 +-
 arch/x86/xen/mmu_hvm.c                      |   2 +-
 arch/x86/xen/mmu_pv.c                       |  34 +-
 arch/x86/xen/time.c                         |  11 +-
 drivers/xen/time.c                          |   2 +-
 49 files changed, 1548 insertions(+), 1477 deletions(-)
 create mode 100644 arch/x86/include/asm/paravirt_full.h
 create mode 100644 arch/x86/include/asm/paravirt_types_full.h
 create mode 100644 arch/x86/kernel/paravirt_full.c

--
2.12.0
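
To see why the pv-hooks the cover letter calls "intrusive" cost anything at all: a hook turns what would natively be a single privileged instruction into an indirect call through a structure of function pointers which a pv-guest replaces at boot (the kernel additionally patches many such call sites at runtime, which is not shown here). The following is a minimal standalone C sketch of just that pattern; every name in it is invented for illustration and none is the kernel's real API:

    #include <stdio.h>

    /* stand-in for a pv ops structure such as pv_cpu_ops */
    struct demo_cpu_ops {
            void (*write_cr0)(unsigned long val);  /* privileged-insn hook */
    };

    static void native_write_cr0_demo(unsigned long val)
    {
            printf("native: mov %%cr0, %#lx\n", val);
    }

    static void xen_write_cr0_demo(unsigned long val)
    {
            printf("pv-guest: hypercall setting cr0 to %#lx\n", val);
    }

    /* defaults to the native implementation, like the kernel's tables do */
    static struct demo_cpu_ops demo_cpu_ops = {
            .write_cr0 = native_write_cr0_demo,
    };

    int main(void)
    {
            demo_cpu_ops.write_cr0(0x80050033UL);        /* native path */
            demo_cpu_ops.write_cr0 = xen_write_cr0_demo; /* pv boot override */
            demo_cpu_ops.write_cr0(0x80050033UL);        /* pv path */
            return 0;
    }

A PARAVIRT_FULL kernel carries such hooks for page table accesses and privileged instructions; after this series, a plain PARAVIRT kernel keeps only the cheap tuning hooks.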
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 01/10] x86: remove stale prototype from arch/x86/include/asm/pgalloc.h
paravirt_alloc_pmd_clone() doesn't exist anywhere. Remove its prototype.

Signed-off-by: Juergen Gross <jgross at suse.com>
---
 arch/x86/include/asm/pgalloc.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index b2d0cd8288aa..71de65bb1791 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -14,8 +14,6 @@ static inline int __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; }
 static inline void paravirt_pgd_free(struct mm_struct *mm, pgd_t *pgd) {}
 static inline void paravirt_alloc_pte(struct mm_struct *mm, unsigned long pfn) {}
 static inline void paravirt_alloc_pmd(struct mm_struct *mm, unsigned long pfn) {}
-static inline void paravirt_alloc_pmd_clone(unsigned long pfn, unsigned long clonepfn,
-					    unsigned long start, unsigned long count) {}
 static inline void paravirt_alloc_pud(struct mm_struct *mm, unsigned long pfn) {}
 static inline void paravirt_alloc_p4d(struct mm_struct *mm, unsigned long pfn) {}
 static inline void paravirt_release_pte(unsigned long pfn) {}
--
2.12.0
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 02/10] paravirt: remove unused function paravirt_disable_iospace()
paravirt_disable_iospace() isn't used anywhere. Remove it.

Signed-off-by: Juergen Gross <jgross at suse.com>
---
 arch/x86/include/asm/paravirt_types.h |  2 --
 arch/x86/kernel/paravirt.c            | 19 -------------------
 2 files changed, 21 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 7465d6fe336f..7a5de42cb465 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -396,8 +396,6 @@ unsigned paravirt_patch_insns(void *insnbuf, unsigned len,
 unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
		      unsigned long addr, unsigned len);

-int paravirt_disable_iospace(void);
-
 /*
  * This generates an indirect call based on the operation type number.
  * The type number, computed in PARAVIRT_PATCH, is derived from the
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 3586996fc50d..b8b23b3f24c2 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -207,25 +207,6 @@ static u64 native_steal_clock(int cpu)
 extern void native_iret(void);
 extern void native_usergs_sysret64(void);

-static struct resource reserve_ioports = {
-	.start = 0,
-	.end = IO_SPACE_LIMIT,
-	.name = "paravirt-ioport",
-	.flags = IORESOURCE_IO | IORESOURCE_BUSY,
-};
-
-/*
- * Reserve the whole legacy IO space to prevent any legacy drivers
- * from wasting time probing for their hardware.  This is a fairly
- * brute-force approach to disabling all non-virtual drivers.
- *
- * Note that this must be called very early to have any effect.
- */
-int paravirt_disable_iospace(void)
-{
-	return request_resource(&ioport_resource, &reserve_ioports);
-}
-
 static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = PARAVIRT_LAZY_NONE;

 static inline void enter_lazy(enum paravirt_lazy_mode mode)
--
2.12.0
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 03/10] xen: move interrupt handling for pv guests under CONFIG_XEN_PV umbrella
There is no need to include pv-guest-only object files in a kernel not configured to support those. Move Xen's irq.o, xen-asm*.o and the pv parts of entry_*.o into CONFIG_XEN_PV sections.

Signed-off-by: Juergen Gross <jgross at suse.com>
---
 arch/x86/entry/entry_32.S | 4 +++-
 arch/x86/entry/entry_64.S | 6 ++++--
 arch/x86/xen/Makefile     | 8 ++++----
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 50bc26949e9e..37ae4a7809d9 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -789,7 +789,7 @@ ENTRY(spurious_interrupt_bug)
	jmp	common_exception
 END(spurious_interrupt_bug)

-#ifdef CONFIG_XEN
+#ifdef CONFIG_XEN_PV
 ENTRY(xen_hypervisor_callback)
	pushl	$-1				/* orig_ax = -1 => not a system call */
	SAVE_ALL
@@ -870,7 +870,9 @@ ENTRY(xen_failsafe_callback)
	_ASM_EXTABLE(3b, 8b)
	_ASM_EXTABLE(4b, 9b)
 ENDPROC(xen_failsafe_callback)
+#endif /* CONFIG_XEN_PV */

+#ifdef CONFIG_XEN
 BUILD_INTERRUPT3(xen_hvm_callback_vector, HYPERVISOR_CALLBACK_VECTOR,
		 xen_evtchn_do_upcall)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 607d72c4a485..cd47214ff402 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -902,7 +902,7 @@ ENTRY(do_softirq_own_stack)
	ret
 END(do_softirq_own_stack)

-#ifdef CONFIG_XEN
+#ifdef CONFIG_XEN_PV
 idtentry xen_hypervisor_callback xen_do_hypervisor_callback has_error_code=0

 /*
@@ -983,7 +983,9 @@ ENTRY(xen_failsafe_callback)
	ENCODE_FRAME_POINTER
	jmp	error_exit
 END(xen_failsafe_callback)
+#endif /* CONFIG_XEN_PV */

+#ifdef CONFIG_XEN
 apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
	xen_hvm_callback_vector xen_evtchn_do_upcall

@@ -998,7 +1000,7 @@ idtentry debug		do_debug		has_error_code=0	paranoid=1 shift_ist=DEBUG_STACK
 idtentry int3		do_int3			has_error_code=0	paranoid=1 shift_ist=DEBUG_STACK
 idtentry stack_segment	do_stack_segment	has_error_code=1

-#ifdef CONFIG_XEN
+#ifdef CONFIG_XEN_PV
 idtentry xen_debug		do_debug		has_error_code=0
 idtentry xen_int3		do_int3			has_error_code=0
 idtentry xen_stack_segment	do_stack_segment	has_error_code=1
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index fffb0a16f9e3..5fc463eaafff 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -10,13 +10,13 @@ nostackp := $(call cc-option, -fno-stack-protector)
 CFLAGS_enlighten_pv.o		:= $(nostackp)
 CFLAGS_mmu_pv.o			:= $(nostackp)

-obj-y		:= enlighten.o multicalls.o mmu.o irq.o \
-			time.o xen-asm.o xen-asm_$(BITS).o \
+obj-y		:= enlighten.o multicalls.o mmu.o time.o \
			grant-table.o suspend.o platform-pci-unplug.o

 obj-$(CONFIG_XEN_PVHVM)	+= enlighten_hvm.o mmu_hvm.o suspend_hvm.o
-obj-$(CONFIG_XEN_PV)	+= setup.o apic.o pmu.o suspend_pv.o \
-			p2m.o enlighten_pv.o mmu_pv.o
+obj-$(CONFIG_XEN_PV)	+= setup.o apic.o pmu.o suspend_pv.o irq.o \
+			p2m.o enlighten_pv.o mmu_pv.o \
+			xen-asm.o xen-asm_$(BITS).o
 obj-$(CONFIG_XEN_PVH)	+= enlighten_pvh.o

 obj-$(CONFIG_EVENT_TRACING) += trace.o
--
2.12.0
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 04/10] xen: remove non-pv test from arch/x86/xen/irq.c
As arch/x86/xen/irq.c is now used for pv-guests only, there is no need to have a test targeting an HVM guest in it. Remove it.

Signed-off-by: Juergen Gross <jgross at suse.com>
---
 arch/x86/xen/irq.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index 33e92955e09d..3b55ae664521 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -130,8 +130,6 @@ static const struct pv_irq_ops xen_irq_ops __initconst = {

 void __init xen_init_irq_ops(void)
 {
-	/* For PVH we use default pv_irq_ops settings. */
-	if (!xen_feature(XENFEAT_hvm_callback_vector))
-		pv_irq_ops = xen_irq_ops;
+	pv_irq_ops = xen_irq_ops;
	x86_init.irqs.intr_init = xen_init_IRQ;
 }
--
2.12.0
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 05/10] paravirt: add new PARAVIRT_FULL config item
Add a new config item PARAVIRT_FULL. It will be used to guard the pv_*_ops functions used only by fully paravirtualized guests (Xen pv-guests and lguest). Kernels not meant to support those guest types will be able to use many operations without the paravirt abstraction while still supporting all the other paravirt features.

For now just add the new Kconfig option and select it for XEN_PV and LGUEST_GUEST. Add paravirt_full.c, paravirt_full.h and paravirt_types_full.h, which will contain the necessary implementation parts of the pv-guest-specific paravirt functions.

Signed-off-by: Juergen Gross <jgross at suse.com>
---
 MAINTAINERS                                |  2 +-
 arch/x86/Kconfig                           |  4 ++++
 arch/x86/include/asm/paravirt.h            |  8 ++++++++
 arch/x86/include/asm/paravirt_full.h       |  4 ++++
 arch/x86/include/asm/paravirt_types.h      |  4 ++++
 arch/x86/include/asm/paravirt_types_full.h |  4 ++++
 arch/x86/kernel/Makefile                   |  1 +
 arch/x86/kernel/paravirt_full.c            | 16 ++++++++++++++++
 arch/x86/lguest/Kconfig                    |  1 +
 arch/x86/xen/Kconfig                       |  1 +
 10 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/paravirt_full.h
 create mode 100644 arch/x86/include/asm/paravirt_types_full.h
 create mode 100644 arch/x86/kernel/paravirt_full.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f7d568b8f133..8f22d1cd10a8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9644,7 +9644,7 @@ L:	virtualization at lists.linux-foundation.org
 S:	Supported
 F:	Documentation/virtual/paravirt_ops.txt
 F:	arch/*/kernel/paravirt*
-F:	arch/*/include/asm/paravirt.h
+F:	arch/*/include/asm/paravirt*.h
 F:	include/linux/hypervisor.h

 PARIDE DRIVERS FOR PARALLEL PORT IDE DEVICES
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cd18994a9555..4d032ed27ce7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -738,6 +738,10 @@ config PARAVIRT_SPINLOCKS

	  If you are unsure how to answer this question, answer Y.

+config PARAVIRT_FULL
+	bool
+	depends on PARAVIRT
+
 config QUEUED_LOCK_STAT
	bool "Paravirt queued spinlock statistics"
	depends on PARAVIRT_SPINLOCKS && DEBUG_FS
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 55fa56fe4e45..419a3b991e72 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -15,6 +15,10 @@
 #include <linux/cpumask.h>
 #include <asm/frame.h>

+#ifdef CONFIG_PARAVIRT_FULL
+#include <asm/paravirt_full.h>
+#endif
+
 static inline void load_sp0(struct tss_struct *tss,
			     struct thread_struct *thread)
 {
@@ -916,6 +920,10 @@ extern void default_banner(void);
 #define PARA_INDIRECT(addr)	*%cs:addr
 #endif

+#ifdef CONFIG_PARAVIRT_FULL
+#include <asm/paravirt_full.h>
+#endif
+
 #define INTERRUPT_RETURN						\
	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_iret), CLBR_NONE,	\
		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_iret))
diff --git a/arch/x86/include/asm/paravirt_full.h b/arch/x86/include/asm/paravirt_full.h
new file mode 100644
index 000000000000..1cabcfff6791
--- /dev/null
+++ b/arch/x86/include/asm/paravirt_full.h
@@ -0,0 +1,4 @@
+#ifndef _ASM_X86_PARAVIRT_FULL_H
+#define _ASM_X86_PARAVIRT_FULL_H
+
+#endif /* _ASM_X86_PARAVIRT_FULL_H */
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 7a5de42cb465..dbb0e69cd5c6 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -60,6 +60,10 @@ struct paravirt_callee_save {
	void *func;
 };

+#ifdef CONFIG_PARAVIRT_FULL
+#include <asm/paravirt_types_full.h>
+#endif
+
 /* general info */
 struct pv_info {
	unsigned int kernel_rpl;
diff --git a/arch/x86/include/asm/paravirt_types_full.h b/arch/x86/include/asm/paravirt_types_full.h
new file mode 100644
index 000000000000..69c048324e70
--- /dev/null
+++ b/arch/x86/include/asm/paravirt_types_full.h
@@ -0,0 +1,4 @@
+#ifndef _ASM_X86_PARAVIRT_TYPES_FULL_H
+#define _ASM_X86_PARAVIRT_TYPES_FULL_H
+
+#endif /* _ASM_X86_PARAVIRT_TYPES_FULL_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 4b994232cb57..80fe640e9b63 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -107,6 +107,7 @@ obj-$(CONFIG_KVM_GUEST)		+= kvm.o kvmclock.o
 obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirt_patch_$(BITS).o
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
 obj-$(CONFIG_PARAVIRT_CLOCK)	+= pvclock.o
+obj-$(CONFIG_PARAVIRT_FULL)	+= paravirt_full.o
 obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o

 obj-$(CONFIG_PCSPKR_PLATFORM)	+= pcspeaker.o
diff --git a/arch/x86/kernel/paravirt_full.c b/arch/x86/kernel/paravirt_full.c
new file mode 100644
index 000000000000..0c7de64129c5
--- /dev/null
+++ b/arch/x86/kernel/paravirt_full.c
@@ -0,0 +1,16 @@
+/*
+    Paravirtualization interfaces for fully paravirtualized guests
+    Copyright (C) 2017 Juergen Gross SUSE Linux GmbH
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+*/
+
+#include <asm/paravirt.h>
diff --git a/arch/x86/lguest/Kconfig b/arch/x86/lguest/Kconfig
index 08f41caada45..ce2c6ee56921 100644
--- a/arch/x86/lguest/Kconfig
+++ b/arch/x86/lguest/Kconfig
@@ -1,6 +1,7 @@
 config LGUEST_GUEST
	bool "Lguest guest support"
	depends on X86_32 && PARAVIRT && PCI
+	select PARAVIRT_FULL
	select TTY
	select VIRTUALIZATION
	select VIRTIO
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 027987638e98..c4177773df81 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -17,6 +17,7 @@ config XEN_PV
	bool "Xen PV guest support"
	default y
	depends on XEN
+	select PARAVIRT_FULL
	select XEN_HAVE_PVMMU
	select XEN_HAVE_VPMU
	help
--
2.12.0
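
The intended use of the new option is a three-way compile-time choice: no PARAVIRT (native code only), PARAVIRT (tuning hooks only), and PARAVIRT plus PARAVIRT_FULL (all hooks). A small standalone sketch of that guard pattern, using invented demo names rather than the real kernel macros; compile it with and without -DCONFIG_PARAVIRT_FULL to see both variants:

    #include <stdio.h>

    static void native_wbinvd_demo(void)
    {
            puts("native: inline wbinvd instruction");
    }

    #ifdef CONFIG_PARAVIRT_FULL
    /* pv-guest kernels go through an overridable ops table */
    static struct demo_full_ops {
            void (*wbinvd)(void);
    } pvfull_demo_ops = {
            .wbinvd = native_wbinvd_demo,   /* a pv-guest would override this */
    };
    #define demo_wbinvd()   pvfull_demo_ops.wbinvd()
    #else
    /* all other kernels call the native implementation directly */
    #define demo_wbinvd()   native_wbinvd_demo()
    #endif

    int main(void)
    {
            demo_wbinvd();
            return 0;
    }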
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 06/10] paravirt: split pv_cpu_ops for support of PARAVIRT_FULL
Move the functions needed only by fully paravirtualized guests into a new structure pvfull_cpu_ops (in paravirt_types_full.h and paravirt_full.h) and move the associated vector into paravirt_full.c.

Signed-off-by: Juergen Gross <jgross at suse.com>
---
 arch/x86/entry/entry_64.S                  |   4 +-
 arch/x86/include/asm/debugreg.h            |   2 +-
 arch/x86/include/asm/desc.h                |   4 +-
 arch/x86/include/asm/irqflags.h            |  16 +-
 arch/x86/include/asm/msr.h                 |   4 +-
 arch/x86/include/asm/paravirt.h            | 257 +--------------------------
 arch/x86/include/asm/paravirt_full.h       | 269 +++++++++++++++++++++++++++
 arch/x86/include/asm/paravirt_types.h      |  78 +--------
 arch/x86/include/asm/paravirt_types_full.h |  78 +++++++++
 arch/x86/include/asm/pgtable.h             |   8 +-
 arch/x86/include/asm/processor.h           |   4 +-
 arch/x86/include/asm/special_insns.h       |  27 +--
 arch/x86/kernel/asm-offsets.c              |   9 +-
 arch/x86/kernel/asm-offsets_64.c           |   6 +-
 arch/x86/kernel/cpu/common.c               |   4 +-
 arch/x86/kernel/paravirt.c                 |  67 +------
 arch/x86/kernel/paravirt_full.c            |  66 +++++++
 arch/x86/kernel/paravirt_patch_32.c        |   8 +-
 arch/x86/kernel/paravirt_patch_64.c        |  18 +-
 arch/x86/lguest/boot.c                     |  38 ++--
 arch/x86/xen/enlighten_pv.c                |  14 +-
 21 files changed, 522 insertions(+), 459 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index cd47214ff402..4e85e9c9a2f8 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -42,12 +42,12 @@
 .code64
 .section .entry.text, "ax"

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_FULL
 ENTRY(native_usergs_sysret64)
	swapgs
	sysretq
 ENDPROC(native_usergs_sysret64)
-#endif /* CONFIG_PARAVIRT */
+#endif /* CONFIG_PARAVIRT_FULL */

 .macro TRACE_IRQS_IRETQ
 #ifdef CONFIG_TRACE_IRQFLAGS
diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
index 12cb66f6d3a5..6477da0e4869 100644
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -7,7 +7,7 @@

 DECLARE_PER_CPU(unsigned long, cpu_dr7);

-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_FULL
 /*
  * These special macros can be used to get or set a debugging register
  */
diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index d0a21b12dd58..be2037db49a8 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -118,7 +118,7 @@ static inline int desc_empty(const void *ptr)
	return !(desc[0] | desc[1]);
 }

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_FULL
 #include <asm/paravirt.h>
 #else
 #define load_TR_desc()				native_load_tr_desc()
@@ -145,7 +145,7 @@ static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
 static inline void paravirt_free_ldt(struct desc_struct *ldt, unsigned entries)
 {
 }
-#endif	/* CONFIG_PARAVIRT */
+#endif	/* CONFIG_PARAVIRT_FULL */

 #define store_ldt(ldt) asm("sldt %0" : "=m"(ldt))
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index ac7692dcfa2e..c3319c20127c 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -119,6 +119,16 @@ static inline notrace unsigned long arch_local_irq_save(void)
 #define DISABLE_INTERRUPTS(x)	cli

 #ifdef CONFIG_X86_64
+#define PARAVIRT_ADJUST_EXCEPTION_FRAME	/*  */
+#endif
+
+#endif /* __ASSEMBLY__ */
+#endif /* CONFIG_PARAVIRT */
+
+#ifndef CONFIG_PARAVIRT_FULL
+#ifdef __ASSEMBLY__
+
+#ifdef CONFIG_X86_64
 #define SWAPGS	swapgs
 /*
  * Currently paravirt can't handle swapgs nicely when we
@@ -131,8 +141,6 @@ static inline notrace unsigned long arch_local_irq_save(void)
  */
 #define SWAPGS_UNSAFE_STACK	swapgs

-#define PARAVIRT_ADJUST_EXCEPTION_FRAME	/*  */
-
 #define INTERRUPT_RETURN	jmp native_iret
 #define USERGS_SYSRET64				\
	swapgs;					\
@@ -143,13 +151,11 @@ static inline notrace unsigned long arch_local_irq_save(void)

 #else
 #define INTERRUPT_RETURN		iret
-#define ENABLE_INTERRUPTS_SYSEXIT	sti; sysexit
 #define GET_CR0_INTO_EAX		movl %cr0, %eax
 #endif

-
 #endif /* __ASSEMBLY__ */
-#endif /* CONFIG_PARAVIRT */
+#endif /* CONFIG_PARAVIRT_FULL */

 #ifndef __ASSEMBLY__
 static inline int arch_irqs_disabled_flags(unsigned long flags)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 898dba2e2e2c..7c715f811590 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -231,7 +231,7 @@ static inline unsigned long long native_read_pmc(int counter)
	return EAX_EDX_VAL(val, low, high);
 }

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_FULL
 #include <asm/paravirt.h>
 #else
 #include <linux/errno.h>
@@ -294,7 +294,7 @@ do {							\

 #define rdpmcl(counter, val) ((val) = native_read_pmc(counter))

-#endif	/* !CONFIG_PARAVIRT */
+#endif	/* !CONFIG_PARAVIRT_FULL */

 /*
  * 64-bit version of wrmsr_safe():
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 419a3b991e72..2287a2465486 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -19,42 +19,6 @@
 #include <asm/paravirt_full.h>
 #endif

-static inline void load_sp0(struct tss_struct *tss,
-			    struct thread_struct *thread)
-{
-	PVOP_VCALL2(pv_cpu_ops.load_sp0, tss, thread);
-}
-
-/* The paravirtualized CPUID instruction. */
-static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
-			   unsigned int *ecx, unsigned int *edx)
-{
-	PVOP_VCALL4(pv_cpu_ops.cpuid, eax, ebx, ecx, edx);
-}
-
-/*
- * These special macros can be used to get or set a debugging register
- */
-static inline unsigned long paravirt_get_debugreg(int reg)
-{
-	return PVOP_CALL1(unsigned long, pv_cpu_ops.get_debugreg, reg);
-}
-#define get_debugreg(var, reg) var = paravirt_get_debugreg(reg)
-static inline void set_debugreg(unsigned long val, int reg)
-{
-	PVOP_VCALL2(pv_cpu_ops.set_debugreg, reg, val);
-}
-
-static inline unsigned long read_cr0(void)
-{
-	return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr0);
-}
-
-static inline void write_cr0(unsigned long x)
-{
-	PVOP_VCALL1(pv_cpu_ops.write_cr0, x);
-}
-
 static inline unsigned long read_cr2(void)
 {
	return PVOP_CALL0(unsigned long, pv_mmu_ops.read_cr2);
@@ -75,28 +39,6 @@ static inline void write_cr3(unsigned long x)
	PVOP_VCALL1(pv_mmu_ops.write_cr3, x);
 }

-static inline unsigned long __read_cr4(void)
-{
-	return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr4);
-}
-
-static inline void __write_cr4(unsigned long x)
-{
-	PVOP_VCALL1(pv_cpu_ops.write_cr4, x);
-}
-
-#ifdef CONFIG_X86_64
-static inline unsigned long read_cr8(void)
-{
-	return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr8);
-}
-
-static inline void write_cr8(unsigned long x)
-{
-	PVOP_VCALL1(pv_cpu_ops.write_cr8, x);
-}
-#endif
-
 static inline void arch_safe_halt(void)
 {
	PVOP_VCALL0(pv_irq_ops.safe_halt);
@@ -107,77 +49,8 @@ static inline void halt(void)
	PVOP_VCALL0(pv_irq_ops.halt);
 }

-static inline void wbinvd(void)
-{
-	PVOP_VCALL0(pv_cpu_ops.wbinvd);
-}
-
 #define get_kernel_rpl()	(pv_info.kernel_rpl)

-static inline u64 paravirt_read_msr(unsigned msr)
-{
-	return PVOP_CALL1(u64, pv_cpu_ops.read_msr, msr);
-}
-
-static inline void paravirt_write_msr(unsigned msr,
-				      unsigned low, unsigned high)
-{
-	return PVOP_VCALL3(pv_cpu_ops.write_msr, msr, low, high);
-}
-
-static inline u64 paravirt_read_msr_safe(unsigned msr, int *err)
-{
-	return PVOP_CALL2(u64, pv_cpu_ops.read_msr_safe, msr, err);
-}
-
-static inline int paravirt_write_msr_safe(unsigned msr,
-					  unsigned low, unsigned high)
-{
-	return PVOP_CALL3(int, pv_cpu_ops.write_msr_safe, msr, low, high);
-}
-
-#define rdmsr(msr, val1, val2)			\
-do {						\
-	u64 _l = paravirt_read_msr(msr);	\
-	val1 = (u32)_l;				\
-	val2 = _l >> 32;			\
-} while (0)
-
-#define wrmsr(msr, val1, val2)			\
-do {						\
-	paravirt_write_msr(msr, val1, val2);	\
-} while (0)
-
-#define rdmsrl(msr, val)			\
-do {						\
-	val = paravirt_read_msr(msr);		\
-} while (0)
-
-static inline void wrmsrl(unsigned msr, u64 val)
-{
-	wrmsr(msr, (u32)val, (u32)(val>>32));
-}
-
-#define wrmsr_safe(msr, a, b)	paravirt_write_msr_safe(msr, a, b)
-
-/* rdmsr with exception handling */
-#define rdmsr_safe(msr, a, b)				\
-({							\
-	int _err;					\
-	u64 _l = paravirt_read_msr_safe(msr, &_err);	\
-	(*a) = (u32)_l;					\
-	(*b) = _l >> 32;				\
-	_err;						\
-})
-
-static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
-{
-	int err;
-
-	*p = paravirt_read_msr_safe(msr, &err);
-	return err;
-}
-
 static inline unsigned long long paravirt_sched_clock(void)
 {
	return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
@@ -192,88 +65,6 @@ static inline u64 paravirt_steal_clock(int cpu)
	return PVOP_CALL1(u64, pv_time_ops.steal_clock, cpu);
 }

-static inline unsigned long long paravirt_read_pmc(int counter)
-{
-	return PVOP_CALL1(u64, pv_cpu_ops.read_pmc, counter);
-}
-
-#define rdpmc(counter, low, high)		\
-do {						\
-	u64 _l = paravirt_read_pmc(counter);	\
-	low = (u32)_l;				\
-	high = _l >> 32;			\
-} while (0)
-
-#define rdpmcl(counter, val) ((val) = paravirt_read_pmc(counter))
-
-static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
-{
-	PVOP_VCALL2(pv_cpu_ops.alloc_ldt, ldt, entries);
-}
-
-static inline void paravirt_free_ldt(struct desc_struct *ldt, unsigned entries)
-{
-	PVOP_VCALL2(pv_cpu_ops.free_ldt, ldt, entries);
-}
-
-static inline void load_TR_desc(void)
-{
-	PVOP_VCALL0(pv_cpu_ops.load_tr_desc);
-}
-static inline void load_gdt(const struct desc_ptr *dtr)
-{
-	PVOP_VCALL1(pv_cpu_ops.load_gdt, dtr);
-}
-static inline void load_idt(const struct desc_ptr *dtr)
-{
-	PVOP_VCALL1(pv_cpu_ops.load_idt, dtr);
-}
-static inline void set_ldt(const void *addr, unsigned entries)
-{
-	PVOP_VCALL2(pv_cpu_ops.set_ldt, addr, entries);
-}
-static inline void store_idt(struct desc_ptr *dtr)
-{
-	PVOP_VCALL1(pv_cpu_ops.store_idt, dtr);
-}
-static inline unsigned long paravirt_store_tr(void)
-{
-	return PVOP_CALL0(unsigned long, pv_cpu_ops.store_tr);
-}
-#define store_tr(tr)	((tr) = paravirt_store_tr())
-static inline void load_TLS(struct thread_struct *t, unsigned cpu)
-{
-	PVOP_VCALL2(pv_cpu_ops.load_tls, t, cpu);
-}
-
-#ifdef CONFIG_X86_64
-static inline void load_gs_index(unsigned int gs)
-{
-	PVOP_VCALL1(pv_cpu_ops.load_gs_index, gs);
-}
-#endif
-
-static inline void write_ldt_entry(struct desc_struct *dt, int entry,
-				   const void *desc)
-{
-	PVOP_VCALL3(pv_cpu_ops.write_ldt_entry, dt, entry, desc);
-}
-
-static inline void write_gdt_entry(struct desc_struct *dt, int entry,
-				   void *desc, int type)
-{
-	PVOP_VCALL4(pv_cpu_ops.write_gdt_entry, dt, entry, desc, type);
-}
-
-static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g)
-{
-	PVOP_VCALL3(pv_cpu_ops.write_idt_entry, dt, entry, g);
-}
-static inline void set_iopl_mask(unsigned mask)
-{
-	PVOP_VCALL1(pv_cpu_ops.set_iopl_mask, mask);
-}
-
 /* The paravirtualized I/O functions */
 static inline void slow_down_io(void)
 {
@@ -670,17 +461,6 @@ static inline void pmd_clear(pmd_t *pmdp)
 }
 #endif	/* CONFIG_X86_PAE */

-#define  __HAVE_ARCH_START_CONTEXT_SWITCH
-static inline void arch_start_context_switch(struct task_struct *prev)
-{
-	PVOP_VCALL1(pv_cpu_ops.start_context_switch, prev);
-}
-
-static inline void arch_end_context_switch(struct task_struct *next)
-{
-	PVOP_VCALL1(pv_cpu_ops.end_context_switch, next);
-}
-
 #define  __HAVE_ARCH_ENTER_LAZY_MMU_MODE
 static inline void arch_enter_lazy_mmu_mode(void)
 {
@@ -924,10 +704,6 @@ extern void default_banner(void);
 #include <asm/paravirt_full.h>
 #endif

-#define INTERRUPT_RETURN						\
-	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_iret), CLBR_NONE,	\
-		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_iret))
-
 #define DISABLE_INTERRUPTS(clobbers)					\
	PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers,	\
		  PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE);		\
@@ -940,32 +716,7 @@ extern void default_banner(void);
		  call PARA_INDIRECT(pv_irq_ops+PV_IRQ_irq_enable);	\
		  PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);)

-#ifdef CONFIG_X86_32
-#define GET_CR0_INTO_EAX				\
-	push %ecx; push %edx;				\
-	call PARA_INDIRECT(pv_cpu_ops+PV_CPU_read_cr0);	\
-	pop %edx; pop %ecx
-#else	/* !CONFIG_X86_32 */
-
-/*
- * If swapgs is used while the userspace stack is still current,
- * there's no way to call a pvop.  The PV replacement *must* be
- * inlined, or the swapgs instruction must be trapped and emulated.
- */
-#define SWAPGS_UNSAFE_STACK						\
-	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE,	\
-		  swapgs)
-
-/*
- * Note: swapgs is very special, and in practise is either going to be
- * implemented with a single "swapgs" instruction or something very
- * special.  Either way, we don't need to save any registers for
- * it.
- */
-#define SWAPGS								\
-	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE,	\
-		  call PARA_INDIRECT(pv_cpu_ops+PV_CPU_swapgs)		\
-	)
+#ifdef CONFIG_X86_64

 #define GET_CR2_INTO_RAX				\
	call PARA_INDIRECT(pv_mmu_ops+PV_MMU_read_cr2)
@@ -975,11 +726,7 @@ extern void default_banner(void);
		  CLBR_NONE,						\
		  call PARA_INDIRECT(pv_irq_ops+PV_IRQ_adjust_exception_frame))

-#define USERGS_SYSRET64							\
-	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret64),	\
-		  CLBR_NONE,						\
-		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_usergs_sysret64))
-#endif	/* CONFIG_X86_32 */
+#endif	/* CONFIG_X86_64 */

 #endif /* __ASSEMBLY__ */
 #else  /* CONFIG_PARAVIRT */
diff --git a/arch/x86/include/asm/paravirt_full.h b/arch/x86/include/asm/paravirt_full.h
index 1cabcfff6791..b3cf0960c161 100644
--- a/arch/x86/include/asm/paravirt_full.h
+++ b/arch/x86/include/asm/paravirt_full.h
@@ -1,4 +1,273 @@
 #ifndef _ASM_X86_PARAVIRT_FULL_H
 #define _ASM_X86_PARAVIRT_FULL_H

+#ifndef __ASSEMBLY__
+
+static inline void load_sp0(struct tss_struct *tss,
+			    struct thread_struct *thread)
+{
+	PVOP_VCALL2(pvfull_cpu_ops.load_sp0, tss, thread);
+}
+
+/* The paravirtualized CPUID instruction. */
+static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
+			   unsigned int *ecx, unsigned int *edx)
+{
+	PVOP_VCALL4(pvfull_cpu_ops.cpuid, eax, ebx, ecx, edx);
+}
+
+/*
+ * These special macros can be used to get or set a debugging register
+ */
+static inline unsigned long paravirt_get_debugreg(int reg)
+{
+	return PVOP_CALL1(unsigned long, pvfull_cpu_ops.get_debugreg, reg);
+}
+#define get_debugreg(var, reg) var = paravirt_get_debugreg(reg)
+static inline void set_debugreg(unsigned long val, int reg)
+{
+	PVOP_VCALL2(pvfull_cpu_ops.set_debugreg, reg, val);
+}
+
+static inline unsigned long read_cr0(void)
+{
+	return PVOP_CALL0(unsigned long, pvfull_cpu_ops.read_cr0);
+}
+
+static inline void write_cr0(unsigned long x)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.write_cr0, x);
+}
+
+static inline unsigned long __read_cr4(void)
+{
+	return PVOP_CALL0(unsigned long, pvfull_cpu_ops.read_cr4);
+}
+
+static inline void __write_cr4(unsigned long x)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.write_cr4, x);
+}
+
+#ifdef CONFIG_X86_64
+static inline unsigned long read_cr8(void)
+{
+	return PVOP_CALL0(unsigned long, pvfull_cpu_ops.read_cr8);
+}
+
+static inline void write_cr8(unsigned long x)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.write_cr8, x);
+}
+#endif
+
+static inline void wbinvd(void)
+{
+	PVOP_VCALL0(pvfull_cpu_ops.wbinvd);
+}
+
+static inline u64 paravirt_read_msr(unsigned msr)
+{
+	return PVOP_CALL1(u64, pvfull_cpu_ops.read_msr, msr);
+}
+
+static inline void paravirt_write_msr(unsigned msr,
+				      unsigned low, unsigned high)
+{
+	return PVOP_VCALL3(pvfull_cpu_ops.write_msr, msr, low, high);
+}
+
+static inline u64 paravirt_read_msr_safe(unsigned msr, int *err)
+{
+	return PVOP_CALL2(u64, pvfull_cpu_ops.read_msr_safe, msr, err);
+}
+
+static inline int paravirt_write_msr_safe(unsigned msr,
+					  unsigned low, unsigned high)
+{
+	return PVOP_CALL3(int, pvfull_cpu_ops.write_msr_safe, msr, low, high);
+}
+
+#define rdmsr(msr, val1, val2)			\
+do {						\
+	u64 _l = paravirt_read_msr(msr);	\
+	val1 = (u32)_l;				\
+	val2 = _l >> 32;			\
+} while (0)
+
+#define wrmsr(msr, val1, val2)			\
+do {						\
+	paravirt_write_msr(msr, val1, val2);	\
+} while (0)
+
+#define rdmsrl(msr, val)			\
+do {						\
+	val = paravirt_read_msr(msr);		\
+} while (0)
+
+static inline void wrmsrl(unsigned msr, u64 val)
+{
+	wrmsr(msr, (u32)val, (u32)(val>>32));
+}
+
+#define wrmsr_safe(msr, a, b)	paravirt_write_msr_safe(msr, a, b)
+
+/* rdmsr with exception handling */
+#define rdmsr_safe(msr, a, b)				\
+({							\
+	int _err;					\
+	u64 _l = paravirt_read_msr_safe(msr, &_err);	\
+	(*a) = (u32)_l;					\
+	(*b) = _l >> 32;				\
+	_err;						\
+})
+
+static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
+{
+	int err;
+
+	*p = paravirt_read_msr_safe(msr, &err);
+	return err;
+}
+
+static inline unsigned long long paravirt_read_pmc(int counter)
+{
+	return PVOP_CALL1(u64, pvfull_cpu_ops.read_pmc, counter);
+}
+
+#define rdpmc(counter, low, high)		\
+do {						\
+	u64 _l = paravirt_read_pmc(counter);	\
+	low = (u32)_l;				\
+	high = _l >> 32;			\
+} while (0)
+
+#define rdpmcl(counter, val) ((val) = paravirt_read_pmc(counter))
+
+static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
+{
+	PVOP_VCALL2(pvfull_cpu_ops.alloc_ldt, ldt, entries);
+}
+
+static inline void paravirt_free_ldt(struct desc_struct *ldt, unsigned entries)
+{
+	PVOP_VCALL2(pvfull_cpu_ops.free_ldt, ldt, entries);
+}
+
+static inline void load_TR_desc(void)
+{
+	PVOP_VCALL0(pvfull_cpu_ops.load_tr_desc);
+}
+
+static inline void load_gdt(const struct desc_ptr *dtr)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.load_gdt, dtr);
+}
+
+static inline void load_idt(const struct desc_ptr *dtr)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.load_idt, dtr);
+}
+
+static inline void set_ldt(const void *addr, unsigned entries)
+{
+	PVOP_VCALL2(pvfull_cpu_ops.set_ldt, addr, entries);
+}
+
+static inline void store_idt(struct desc_ptr *dtr)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.store_idt, dtr);
+}
+
+static inline unsigned long paravirt_store_tr(void)
+{
+	return PVOP_CALL0(unsigned long, pvfull_cpu_ops.store_tr);
+}
+
+#define store_tr(tr)	((tr) = paravirt_store_tr())
+
+static inline void load_TLS(struct thread_struct *t, unsigned cpu)
+{
+	PVOP_VCALL2(pvfull_cpu_ops.load_tls, t, cpu);
+}
+
+#ifdef CONFIG_X86_64
+static inline void load_gs_index(unsigned int gs)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.load_gs_index, gs);
+}
+#endif
+
+static inline void write_ldt_entry(struct desc_struct *dt, int entry,
+				   const void *desc)
+{
+	PVOP_VCALL3(pvfull_cpu_ops.write_ldt_entry, dt, entry, desc);
+}
+
+static inline void write_gdt_entry(struct desc_struct *dt, int entry,
+				   void *desc, int type)
+{
+	PVOP_VCALL4(pvfull_cpu_ops.write_gdt_entry, dt, entry, desc, type);
+}
+
+static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g)
+{
+	PVOP_VCALL3(pvfull_cpu_ops.write_idt_entry, dt, entry, g);
+}
+
+static inline void set_iopl_mask(unsigned mask)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.set_iopl_mask, mask);
+}
+
+#define __HAVE_ARCH_START_CONTEXT_SWITCH
+static inline void arch_start_context_switch(struct task_struct *prev)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.start_context_switch, prev);
+}
+
+static inline void arch_end_context_switch(struct task_struct *next)
+{
+	PVOP_VCALL1(pvfull_cpu_ops.end_context_switch, next);
+}
+
+#else  /* __ASSEMBLY__ */
+
+#define INTERRUPT_RETURN						\
+	PARA_SITE(PARA_PATCH(pvfull_cpu_ops, PV_CPU_iret), CLBR_NONE,	\
+		  jmp PARA_INDIRECT(pvfull_cpu_ops+PV_CPU_iret))
+
+#ifdef CONFIG_X86_32
+#define GET_CR0_INTO_EAX					\
+	push %ecx; push %edx;					\
+	call PARA_INDIRECT(pvfull_cpu_ops+PV_CPU_read_cr0);	\
+	pop %edx; pop %ecx
+#else	/* !CONFIG_X86_32 */
+
+/*
+ * If swapgs is used while the userspace stack is still current,
+ * there's no way to call a pvop.  The PV replacement *must* be
+ * inlined, or the swapgs instruction must be trapped and emulated.
+ */
+#define SWAPGS_UNSAFE_STACK						\
+	PARA_SITE(PARA_PATCH(pvfull_cpu_ops, PV_CPU_swapgs), CLBR_NONE,	\
+		  swapgs)
+
+/*
+ * Note: swapgs is very special, and in practise is either going to be
+ * implemented with a single "swapgs" instruction or something very
+ * special.  Either way, we don't need to save any registers for
+ * it.
+ */
+#define SWAPGS								\
+	PARA_SITE(PARA_PATCH(pvfull_cpu_ops, PV_CPU_swapgs), CLBR_NONE,	\
+		  call PARA_INDIRECT(pvfull_cpu_ops+PV_CPU_swapgs))
+
+#define USERGS_SYSRET64							\
+	PARA_SITE(PARA_PATCH(pvfull_cpu_ops, PV_CPU_usergs_sysret64),	\
+		  CLBR_NONE,						\
+		  jmp PARA_INDIRECT(pvfull_cpu_ops+PV_CPU_usergs_sysret64))
+#endif	/* CONFIG_X86_32 */
+
+#endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_PARAVIRT_FULL_H */
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index dbb0e69cd5c6..e0fb1291bbdb 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -103,82 +103,7 @@ struct pv_time_ops {
 };

 struct pv_cpu_ops {
-	/* hooks for various privileged instructions */
-	unsigned long (*get_debugreg)(int regno);
-	void (*set_debugreg)(int regno, unsigned long value);
-
-	unsigned long (*read_cr0)(void);
-	void (*write_cr0)(unsigned long);
-
-	unsigned long (*read_cr4)(void);
-	void (*write_cr4)(unsigned long);
-
-#ifdef CONFIG_X86_64
-	unsigned long (*read_cr8)(void);
-	void (*write_cr8)(unsigned long);
-#endif
-
-	/* Segment descriptor handling */
-	void (*load_tr_desc)(void);
-	void (*load_gdt)(const struct desc_ptr *);
-	void (*load_idt)(const struct desc_ptr *);
-	/* store_gdt has been removed. */
-	void (*store_idt)(struct desc_ptr *);
-	void (*set_ldt)(const void *desc, unsigned entries);
-	unsigned long (*store_tr)(void);
-	void (*load_tls)(struct thread_struct *t, unsigned int cpu);
-#ifdef CONFIG_X86_64
-	void (*load_gs_index)(unsigned int idx);
-#endif
-	void (*write_ldt_entry)(struct desc_struct *ldt, int entrynum,
-				const void *desc);
-	void (*write_gdt_entry)(struct desc_struct *,
-				int entrynum, const void *desc, int size);
-	void (*write_idt_entry)(gate_desc *,
-				int entrynum, const gate_desc *gate);
-	void (*alloc_ldt)(struct desc_struct *ldt, unsigned entries);
-	void (*free_ldt)(struct desc_struct *ldt, unsigned entries);
-
-	void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t);
-
-	void (*set_iopl_mask)(unsigned mask);
-
-	void (*wbinvd)(void);
	void (*io_delay)(void);
-
-	/* cpuid emulation, mostly so that caps bits can be disabled */
-	void (*cpuid)(unsigned int *eax, unsigned int *ebx,
-		      unsigned int *ecx, unsigned int *edx);
-
-	/* Unsafe MSR operations.  These will warn or panic on failure. */
-	u64 (*read_msr)(unsigned int msr);
-	void (*write_msr)(unsigned int msr, unsigned low, unsigned high);
-
-	/*
-	 * Safe MSR operations.
-	 * read sets err to 0 or -EIO.  write returns 0 or -EIO.
-	 */
-	u64 (*read_msr_safe)(unsigned int msr, int *err);
-	int (*write_msr_safe)(unsigned int msr, unsigned low, unsigned high);
-
-	u64 (*read_pmc)(int counter);
-
-	/*
-	 * Switch to usermode gs and return to 64-bit usermode using
-	 * sysret.  Only used in 64-bit kernels to return to 64-bit
-	 * processes.  Usermode register state, including %rsp, must
-	 * already be restored.
-	 */
-	void (*usergs_sysret64)(void);
-
-	/* Normal iret.  Jump to this with the standard iret stack
-	   frame set up. */
-	void (*iret)(void);
-
-	void (*swapgs)(void);
-
-	void (*start_context_switch)(struct task_struct *prev);
-	void (*end_context_switch)(struct task_struct *next);
 };

 struct pv_irq_ops {
@@ -339,6 +264,9 @@ struct paravirt_patch_template {
	struct pv_irq_ops pv_irq_ops;
	struct pv_mmu_ops pv_mmu_ops;
	struct pv_lock_ops pv_lock_ops;
+#ifdef CONFIG_PARAVIRT_FULL
+	struct pvfull_cpu_ops pvfull_cpu_ops;
+#endif
 };

 extern struct pv_info pv_info;
diff --git a/arch/x86/include/asm/paravirt_types_full.h b/arch/x86/include/asm/paravirt_types_full.h
index 69c048324e70..50635628f6e8 100644
--- a/arch/x86/include/asm/paravirt_types_full.h
+++ b/arch/x86/include/asm/paravirt_types_full.h
@@ -1,4 +1,82 @@
 #ifndef _ASM_X86_PARAVIRT_TYPES_FULL_H
 #define _ASM_X86_PARAVIRT_TYPES_FULL_H

+struct pvfull_cpu_ops {
+	/* hooks for various privileged instructions */
+	unsigned long (*get_debugreg)(int regno);
+	void (*set_debugreg)(int regno, unsigned long value);
+
+	unsigned long (*read_cr0)(void);
+	void (*write_cr0)(unsigned long);
+
+	unsigned long (*read_cr4)(void);
+	void (*write_cr4)(unsigned long);
+
+#ifdef CONFIG_X86_64
+	unsigned long (*read_cr8)(void);
+	void (*write_cr8)(unsigned long);
+#endif
+
+	/* Segment descriptor handling */
+	void (*load_tr_desc)(void);
+	void (*load_gdt)(const struct desc_ptr *);
+	void (*load_idt)(const struct desc_ptr *);
+	void (*store_idt)(struct desc_ptr *);
+	void (*set_ldt)(const void *desc, unsigned entries);
+	unsigned long (*store_tr)(void);
+	void (*load_tls)(struct thread_struct *t, unsigned int cpu);
+#ifdef CONFIG_X86_64
+	void (*load_gs_index)(unsigned int idx);
+#endif
+	void (*write_ldt_entry)(struct desc_struct *ldt, int entrynum,
+				const void *desc);
+	void (*write_gdt_entry)(struct desc_struct *,
+				int entrynum, const void *desc, int size);
+	void (*write_idt_entry)(gate_desc *,
+				int entrynum, const gate_desc *gate);
+	void (*alloc_ldt)(struct desc_struct *ldt, unsigned entries);
+	void (*free_ldt)(struct desc_struct *ldt, unsigned entries);
+
+	void (*load_sp0)(struct tss_struct *tss, struct thread_struct *t);
+
+	void (*set_iopl_mask)(unsigned mask);
+
+	void (*wbinvd)(void);
+
+	/* cpuid emulation, mostly so that caps bits can be disabled */
+	void (*cpuid)(unsigned int *eax, unsigned int *ebx,
+		      unsigned int *ecx, unsigned int *edx);
+
+	/* Unsafe MSR operations.  These will warn or panic on failure. */
+	u64 (*read_msr)(unsigned int msr);
+	void (*write_msr)(unsigned int msr, unsigned low, unsigned high);
+
+	/*
+	 * Safe MSR operations.
+	 * read sets err to 0 or -EIO.  write returns 0 or -EIO.
+	 */
+	u64 (*read_msr_safe)(unsigned int msr, int *err);
+	int (*write_msr_safe)(unsigned int msr, unsigned low, unsigned high);
+
+	u64 (*read_pmc)(int counter);
+
+	/*
+	 * Switch to usermode gs and return to 64-bit usermode using
+	 * sysret.  Only used in 64-bit kernels to return to 64-bit
+	 * processes.  Usermode register state, including %rsp, must
+	 * already be restored.
+	 */
+	void (*usergs_sysret64)(void);
+
+	/* Normal iret.  Jump to this with the std iret stack frame set up. */
+	void (*iret)(void);
+
+	void (*swapgs)(void);
+
+	void (*start_context_switch)(struct task_struct *prev);
+	void (*end_context_switch)(struct task_struct *next);
+};
+
+extern struct pvfull_cpu_ops pvfull_cpu_ops;
+
 #endif /* _ASM_X86_PARAVIRT_TYPES_FULL_H */
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f5af95a0c6b8..fad12c481bf9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -98,10 +98,14 @@ extern struct mm_struct *pgd_page_get_mm(struct page *page);
 #define pte_val(x)	native_pte_val(x)
 #define __pte(x)	native_make_pte(x)

-#define arch_end_context_switch(prev)	do {} while(0)
-
 #endif	/* CONFIG_PARAVIRT */

+#ifndef CONFIG_PARAVIRT_FULL
+
+#define arch_end_context_switch(prev)	do {} while (0)
+
+#endif	/* CONFIG_PARAVIRT_FULL */
+
 /*
  * The following only work if pte_present() is true.
  * Undefined behaviour if not..
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 3cada998a402..9592c47f52bc 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -531,7 +531,7 @@ static inline unsigned long current_top_of_stack(void)
 #endif
 }

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_FULL
 #include <asm/paravirt.h>
 #else
 #define __cpuid			native_cpuid
@@ -543,7 +543,7 @@ static inline void load_sp0(struct tss_struct *tss,
 }

 #define set_iopl_mask native_set_iopl_mask
-#endif /* CONFIG_PARAVIRT */
+#endif /* CONFIG_PARAVIRT_FULL */

 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 12af3e35edfa..ca3a3103791d 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -139,16 +139,6 @@ extern asmlinkage void native_load_gs_index(unsigned);
 #include <asm/paravirt.h>
 #else

-static inline unsigned long read_cr0(void)
-{
-	return native_read_cr0();
-}
-
-static inline void write_cr0(unsigned long x)
-{
-	native_write_cr0(x);
-}
-
 static inline unsigned long read_cr2(void)
 {
	return native_read_cr2();
@@ -169,6 +159,20 @@ static inline void write_cr3(unsigned long x)
	native_write_cr3(x);
 }

+#endif/* CONFIG_PARAVIRT */
+
+#ifndef CONFIG_PARAVIRT_FULL
+
+static inline unsigned long read_cr0(void)
+{
+	return native_read_cr0();
+}
+
+static inline void write_cr0(unsigned long x)
+{
+	native_write_cr0(x);
+}
+
 static inline unsigned long __read_cr4(void)
 {
	return native_read_cr4();
@@ -203,7 +207,8 @@ static inline void load_gs_index(unsigned selector)

 #endif

-#endif/* CONFIG_PARAVIRT */
+#endif /* CONFIG_PARAVIRT_FULL */
+

 static inline void clflush(volatile void *__p)
 {
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index de827d6ac8c2..7b393e453333 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -64,14 +64,17 @@ void common(void) {

 #ifdef CONFIG_PARAVIRT
	BLANK();
-	OFFSET(PARAVIRT_PATCH_pv_cpu_ops, paravirt_patch_template, pv_cpu_ops);
	OFFSET(PARAVIRT_PATCH_pv_irq_ops, paravirt_patch_template, pv_irq_ops);
	OFFSET(PV_IRQ_irq_disable, pv_irq_ops, irq_disable);
	OFFSET(PV_IRQ_irq_enable, pv_irq_ops, irq_enable);
-	OFFSET(PV_CPU_iret, pv_cpu_ops, iret);
-	OFFSET(PV_CPU_read_cr0, pv_cpu_ops, read_cr0);
	OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2);
 #endif
+#ifdef CONFIG_PARAVIRT_FULL
+	OFFSET(PARAVIRT_PATCH_pvfull_cpu_ops, paravirt_patch_template,
+	       pvfull_cpu_ops);
+	OFFSET(PV_CPU_iret, pvfull_cpu_ops, iret);
+	OFFSET(PV_CPU_read_cr0, pvfull_cpu_ops, read_cr0);
+#endif

 #ifdef CONFIG_XEN
	BLANK();
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index 99332f550c48..f4fe7d9ac0d9 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -21,8 +21,10 @@ int main(void)
 {
 #ifdef CONFIG_PARAVIRT
	OFFSET(PV_IRQ_adjust_exception_frame, pv_irq_ops, adjust_exception_frame);
-	OFFSET(PV_CPU_usergs_sysret64, pv_cpu_ops, usergs_sysret64);
-	OFFSET(PV_CPU_swapgs, pv_cpu_ops, swapgs);
+#endif
+#ifdef CONFIG_PARAVIRT_FULL
+	OFFSET(PV_CPU_usergs_sysret64, pvfull_cpu_ops, usergs_sysret64);
+	OFFSET(PV_CPU_swapgs, pvfull_cpu_ops, swapgs);
	BLANK();
 #endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index c8b39870f33e..53081df88420 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1017,10 +1017,10 @@ static void generic_identify(struct cpuinfo_x86 *c)
	 * ESPFIX issue, we can change this.
	 */
 #ifdef CONFIG_X86_32
-# ifdef CONFIG_PARAVIRT
+# ifdef CONFIG_PARAVIRT_FULL
	do {
		extern void native_iret(void);
-		if (pv_cpu_ops.iret == native_iret)
+		if (pvfull_cpu_ops.iret == native_iret)
			set_cpu_bug(c, X86_BUG_ESPFIX);
	} while (0);
 # else
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index b8b23b3f24c2..6b90de65479e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -24,12 +24,9 @@
 #include <linux/efi.h>
 #include <linux/bcd.h>
 #include <linux/highmem.h>
-#include <linux/kprobes.h>

 #include <asm/bug.h>
 #include <asm/paravirt.h>
-#include <asm/debugreg.h>
-#include <asm/desc.h>
 #include <asm/setup.h>
 #include <asm/pgtable.h>
 #include <asm/time.h>
@@ -128,6 +125,9 @@ static void *get_call_destination(u8 type)
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
		.pv_lock_ops = pv_lock_ops,
 #endif
+#ifdef CONFIG_PARAVIRT_FULL
+		.pvfull_cpu_ops = pvfull_cpu_ops,
+#endif
	};
	return *((void **)&tmpl + type);
 }
@@ -150,10 +150,12 @@ unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
	else if (opfunc == _paravirt_ident_64)
		ret = paravirt_patch_ident_64(insnbuf, len);

-	else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) ||
-		 type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret64))
+#ifdef CONFIG_PARAVIRT_FULL
+	else if (type == PARAVIRT_PATCH(pvfull_cpu_ops.iret) ||
+		 type == PARAVIRT_PATCH(pvfull_cpu_ops.usergs_sysret64))
		/* If operation requires a jmp, then jmp */
		ret = paravirt_patch_jmp(insnbuf, opfunc, addr, len);
+#endif
	else
		/* Otherwise call the function; assume target could
		   clobber any caller-save reg */
@@ -203,10 +205,6 @@ static u64 native_steal_clock(int cpu)
	return 0;
 }

-/* These are in entry.S */
-extern void native_iret(void);
-extern void native_usergs_sysret64(void);
-
 static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = PARAVIRT_LAZY_NONE;

 static inline void enter_lazy(enum paravirt_lazy_mode mode)
@@ -306,60 +304,9 @@ __visible struct pv_irq_ops pv_irq_ops = {
 };

 __visible struct pv_cpu_ops pv_cpu_ops = {
-	.cpuid = native_cpuid,
-	.get_debugreg = native_get_debugreg,
-	.set_debugreg = native_set_debugreg,
-	.read_cr0 = native_read_cr0,
-	.write_cr0 = native_write_cr0,
-	.read_cr4 = native_read_cr4,
-	.write_cr4 = native_write_cr4,
-#ifdef CONFIG_X86_64
-	.read_cr8 = native_read_cr8,
-	.write_cr8 = native_write_cr8,
-#endif
-	.wbinvd = native_wbinvd,
-	.read_msr = native_read_msr,
-	.write_msr = native_write_msr,
-	.read_msr_safe = native_read_msr_safe,
-	.write_msr_safe = native_write_msr_safe,
-	.read_pmc = native_read_pmc,
-	.load_tr_desc = native_load_tr_desc,
-	.set_ldt = native_set_ldt,
-	.load_gdt = native_load_gdt,
-	.load_idt = native_load_idt,
-	.store_idt = native_store_idt,
-	.store_tr = native_store_tr,
-	.load_tls = native_load_tls,
-#ifdef CONFIG_X86_64
-	.load_gs_index = native_load_gs_index,
-#endif
-	.write_ldt_entry = native_write_ldt_entry,
-	.write_gdt_entry = native_write_gdt_entry,
-	.write_idt_entry = native_write_idt_entry,
-
-	.alloc_ldt = paravirt_nop,
-	.free_ldt = paravirt_nop,
-
-	.load_sp0 = native_load_sp0,
-
-#ifdef CONFIG_X86_64
-	.usergs_sysret64 = native_usergs_sysret64,
-#endif
-	.iret = native_iret,
-	.swapgs = native_swapgs,
-
-	.set_iopl_mask = native_set_iopl_mask,
	.io_delay = native_io_delay,
-
-	.start_context_switch = paravirt_nop,
-	.end_context_switch = paravirt_nop,
 };

-/* At this point, native_get/set_debugreg has real function entries */
-NOKPROBE_SYMBOL(native_get_debugreg);
-NOKPROBE_SYMBOL(native_set_debugreg);
-NOKPROBE_SYMBOL(native_load_idt);
-
 #if defined(CONFIG_X86_32) && !defined(CONFIG_X86_PAE)
 /* 32-bit pagetable entries */
 #define PTE_IDENT	__PV_IS_CALLEE_SAVE(_paravirt_ident_32)
diff --git a/arch/x86/kernel/paravirt_full.c b/arch/x86/kernel/paravirt_full.c
index 0c7de64129c5..9b8708421cd2 100644
--- a/arch/x86/kernel/paravirt_full.c
+++ b/arch/x86/kernel/paravirt_full.c
@@ -13,4 +13,70 @@
     GNU General Public License for more details.
 */

+#include <linux/percpu.h>
+#include <linux/kprobes.h>
+
 #include <asm/paravirt.h>
+#include <asm/debugreg.h>
+#include <asm/desc.h>
+#include <asm/processor.h>
+
+/* These are in entry.S */
+extern void native_iret(void);
+extern void native_usergs_sysret64(void);
+
+__visible struct pvfull_cpu_ops pvfull_cpu_ops = {
+	.cpuid = native_cpuid,
+	.get_debugreg = native_get_debugreg,
+	.set_debugreg = native_set_debugreg,
+	.read_cr0 = native_read_cr0,
+	.write_cr0 = native_write_cr0,
+	.read_cr4 = native_read_cr4,
+	.write_cr4 = native_write_cr4,
+#ifdef CONFIG_X86_64
+	.read_cr8 = native_read_cr8,
+	.write_cr8 = native_write_cr8,
+#endif
+	.wbinvd = native_wbinvd,
+	.read_msr = native_read_msr,
+	.write_msr = native_write_msr,
+	.read_msr_safe = native_read_msr_safe,
+	.write_msr_safe = native_write_msr_safe,
+	.read_pmc = native_read_pmc,
+	.load_tr_desc = native_load_tr_desc,
+	.set_ldt = native_set_ldt,
+	.load_gdt = native_load_gdt,
+	.load_idt = native_load_idt,
+	.store_idt = native_store_idt,
+	.store_tr = native_store_tr,
+	.load_tls = native_load_tls,
+#ifdef CONFIG_X86_64
+	.load_gs_index = native_load_gs_index,
+#endif
+	.write_ldt_entry = native_write_ldt_entry,
+	.write_gdt_entry = native_write_gdt_entry,
+	.write_idt_entry = native_write_idt_entry,
+
+	.alloc_ldt = paravirt_nop,
+	.free_ldt = paravirt_nop,
+
+	.load_sp0 = native_load_sp0,
+
+#ifdef CONFIG_X86_64
+	.usergs_sysret64 = native_usergs_sysret64,
+#endif
+	.iret = native_iret,
+	.swapgs = native_swapgs,
+
+	.set_iopl_mask = native_set_iopl_mask,
+
+	.start_context_switch = paravirt_nop,
+	.end_context_switch = paravirt_nop,
+};
+
+/* At this point, native_get/set_debugreg has real function entries */
+NOKPROBE_SYMBOL(native_get_debugreg);
+NOKPROBE_SYMBOL(native_set_debugreg);
+NOKPROBE_SYMBOL(native_load_idt);
+
+EXPORT_SYMBOL(pvfull_cpu_ops);
diff --git a/arch/x86/kernel/paravirt_patch_32.c b/arch/x86/kernel/paravirt_patch_32.c
index 553acbbb4d32..ccb75951aed5 100644
--- a/arch/x86/kernel/paravirt_patch_32.c
+++ b/arch/x86/kernel/paravirt_patch_32.c
@@ -4,10 +4,12 @@ DEF_NATIVE(pv_irq_ops, irq_disable, "cli");
 DEF_NATIVE(pv_irq_ops, irq_enable, "sti");
 DEF_NATIVE(pv_irq_ops, restore_fl, "push %eax; popf");
 DEF_NATIVE(pv_irq_ops, save_fl, "pushf; pop %eax");
-DEF_NATIVE(pv_cpu_ops, iret, "iret");
 DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax");
 DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3");
 DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax");
+#ifdef CONFIG_PARAVIRT_FULL
+DEF_NATIVE(pvfull_cpu_ops, iret, "iret");
+#endif

 #if defined(CONFIG_PARAVIRT_SPINLOCKS)
 DEF_NATIVE(pv_lock_ops, queued_spin_unlock, "movb $0, (%eax)");
@@ -45,10 +47,12 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
		PATCH_SITE(pv_irq_ops, irq_enable);
		PATCH_SITE(pv_irq_ops, restore_fl);
		PATCH_SITE(pv_irq_ops, save_fl);
-		PATCH_SITE(pv_cpu_ops, iret);
		PATCH_SITE(pv_mmu_ops, read_cr2);
		PATCH_SITE(pv_mmu_ops, read_cr3);
		PATCH_SITE(pv_mmu_ops, write_cr3);
+#ifdef CONFIG_PARAVIRT_FULL
+		PATCH_SITE(pvfull_cpu_ops, iret);
+#endif
 #if defined(CONFIG_PARAVIRT_SPINLOCKS)
		case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):
			if (pv_is_native_spin_unlock()) {
diff --git a/arch/x86/kernel/paravirt_patch_64.c b/arch/x86/kernel/paravirt_patch_64.c
index 11aaf1eaa0e4..00d5c77d23a7 100644
--- a/arch/x86/kernel/paravirt_patch_64.c
+++ b/arch/x86/kernel/paravirt_patch_64.c
@@ -10,14 +10,16 @@ DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax");
 DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax");
 DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3");
 DEF_NATIVE(pv_mmu_ops, flush_tlb_single, "invlpg (%rdi)");
-DEF_NATIVE(pv_cpu_ops, wbinvd, "wbinvd");
-
-DEF_NATIVE(pv_cpu_ops, usergs_sysret64, "swapgs; sysretq");
-DEF_NATIVE(pv_cpu_ops, swapgs, "swapgs");

 DEF_NATIVE(, mov32, "mov %edi, %eax");
 DEF_NATIVE(, mov64, "mov %rdi, %rax");

+#ifdef CONFIG_PARAVIRT_FULL
+DEF_NATIVE(pvfull_cpu_ops, wbinvd, "wbinvd");
+DEF_NATIVE(pvfull_cpu_ops, usergs_sysret64, "swapgs; sysretq");
+DEF_NATIVE(pvfull_cpu_ops, swapgs, "swapgs");
+#endif
+
 #if defined(CONFIG_PARAVIRT_SPINLOCKS)
 DEF_NATIVE(pv_lock_ops, queued_spin_unlock, "movb $0, (%rdi)");
 DEF_NATIVE(pv_lock_ops, vcpu_is_preempted, "xor %rax, %rax");
@@ -54,13 +56,15 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
		PATCH_SITE(pv_irq_ops, save_fl);
		PATCH_SITE(pv_irq_ops, irq_enable);
		PATCH_SITE(pv_irq_ops, irq_disable);
-		PATCH_SITE(pv_cpu_ops, usergs_sysret64);
-		PATCH_SITE(pv_cpu_ops, swapgs);
		PATCH_SITE(pv_mmu_ops, read_cr2);
		PATCH_SITE(pv_mmu_ops, read_cr3);
		PATCH_SITE(pv_mmu_ops, write_cr3);
		PATCH_SITE(pv_mmu_ops, flush_tlb_single);
-		PATCH_SITE(pv_cpu_ops, wbinvd);
+#ifdef CONFIG_PARAVIRT_FULL
+		PATCH_SITE(pvfull_cpu_ops, usergs_sysret64);
+		PATCH_SITE(pvfull_cpu_ops, swapgs);
+		PATCH_SITE(pvfull_cpu_ops, wbinvd);
+#endif
 #if defined(CONFIG_PARAVIRT_SPINLOCKS)
		case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):
			if (pv_is_native_spin_unlock()) {
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index 99472698c931..fa79dbe220ad 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -1410,25 +1410,25 @@ __init void lguest_init(void)
	pv_init_ops.patch = lguest_patch;

	/* Intercepts of various CPU instructions */
-	pv_cpu_ops.load_gdt = lguest_load_gdt;
-	pv_cpu_ops.cpuid = lguest_cpuid;
-	pv_cpu_ops.load_idt = lguest_load_idt;
-	pv_cpu_ops.iret = lguest_iret;
-	pv_cpu_ops.load_sp0 = lguest_load_sp0;
-	pv_cpu_ops.load_tr_desc = lguest_load_tr_desc;
-	pv_cpu_ops.set_ldt = lguest_set_ldt;
-	pv_cpu_ops.load_tls = lguest_load_tls;
-	pv_cpu_ops.get_debugreg = lguest_get_debugreg;
-	pv_cpu_ops.set_debugreg = lguest_set_debugreg;
-	pv_cpu_ops.read_cr0 = lguest_read_cr0;
-	pv_cpu_ops.write_cr0 = lguest_write_cr0;
-	pv_cpu_ops.read_cr4 = lguest_read_cr4;
-	pv_cpu_ops.write_cr4 = lguest_write_cr4;
-	pv_cpu_ops.write_gdt_entry = lguest_write_gdt_entry;
-	pv_cpu_ops.write_idt_entry = lguest_write_idt_entry;
-	pv_cpu_ops.wbinvd = lguest_wbinvd;
-	pv_cpu_ops.start_context_switch = paravirt_start_context_switch;
-	pv_cpu_ops.end_context_switch = lguest_end_context_switch;
+	pvfull_cpu_ops.load_gdt = lguest_load_gdt;
+	pvfull_cpu_ops.cpuid = lguest_cpuid;
+	pvfull_cpu_ops.load_idt = lguest_load_idt;
+	pvfull_cpu_ops.iret = lguest_iret;
+	pvfull_cpu_ops.load_sp0 = lguest_load_sp0;
+	pvfull_cpu_ops.load_tr_desc = lguest_load_tr_desc;
+	pvfull_cpu_ops.set_ldt = lguest_set_ldt;
+	pvfull_cpu_ops.load_tls = lguest_load_tls;
+	pvfull_cpu_ops.get_debugreg = lguest_get_debugreg;
+	pvfull_cpu_ops.set_debugreg = lguest_set_debugreg;
+	pvfull_cpu_ops.read_cr0 = lguest_read_cr0;
+	pvfull_cpu_ops.write_cr0 = lguest_write_cr0;
+	pvfull_cpu_ops.read_cr4 = lguest_read_cr4;
+	pvfull_cpu_ops.write_cr4 = lguest_write_cr4;
+	pvfull_cpu_ops.write_gdt_entry = lguest_write_gdt_entry;
+	pvfull_cpu_ops.write_idt_entry = lguest_write_idt_entry;
+	pvfull_cpu_ops.wbinvd = lguest_wbinvd;
+	pvfull_cpu_ops.start_context_switch = paravirt_start_context_switch;
+	pvfull_cpu_ops.end_context_switch = lguest_end_context_switch;

	/* Pagetable management */
	pv_mmu_ops.write_cr3 = lguest_write_cr3;
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 7cd442690f9d..89cd5cc5f1a2 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1072,7 +1072,7 @@ static const struct pv_init_ops xen_init_ops __initconst = {
	.patch = xen_patch,
 };

-static const struct pv_cpu_ops xen_cpu_ops __initconst = {
+static const struct pvfull_cpu_ops xen_cpu_ops __initconst = {
	.cpuid = xen_cpuid,

	.set_debugreg = xen_set_debugreg,
@@ -1125,7 +1125,6 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
	.load_sp0 = xen_load_sp0,

	.set_iopl_mask = xen_set_iopl_mask,
-	.io_delay = xen_io_delay,

	/* Xen takes care of %gs when switching to usermode for us */
	.swapgs = paravirt_nop,
@@ -1236,14 +1235,14 @@ static void __init xen_boot_params_init_edd(void)
  */
 static void xen_setup_gdt(int cpu)
 {
-	pv_cpu_ops.write_gdt_entry = xen_write_gdt_entry_boot;
-	pv_cpu_ops.load_gdt = xen_load_gdt_boot;
+	pvfull_cpu_ops.write_gdt_entry = xen_write_gdt_entry_boot;
+	pvfull_cpu_ops.load_gdt = xen_load_gdt_boot;

	setup_stack_canary_segment(0);
	switch_to_new_gdt(0);

-	pv_cpu_ops.write_gdt_entry = xen_write_gdt_entry;
-	pv_cpu_ops.load_gdt = xen_load_gdt;
+	pvfull_cpu_ops.write_gdt_entry = xen_write_gdt_entry;
+	pvfull_cpu_ops.load_gdt = xen_load_gdt;
 }

 static void __init xen_dom0_set_legacy_features(void)
@@ -1270,7 +1269,8 @@ asmlinkage __visible void __init xen_start_kernel(void)
	/* Install Xen paravirt ops */
	pv_info = xen_info;
	pv_init_ops = xen_init_ops;
-	pv_cpu_ops = xen_cpu_ops;
+	pvfull_cpu_ops = xen_cpu_ops;
+	pv_cpu_ops.io_delay = xen_io_delay;

	x86_platform.get_nmi_reason = xen_get_nmi_reason;
--
2.12.0
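
The net effect of this patch can be pictured with a short standalone sketch (all names invented for illustration): the cheap io_delay hook stays in pv_cpu_ops, the intrusive cpu hooks move to pvfull_cpu_ops, and a pv-guest overrides both at boot, roughly as xen_start_kernel() above now does:

    #include <stdio.h>

    /* common tuning hook, present in every PARAVIRT kernel */
    struct demo_pv_cpu_ops { void (*io_delay)(void); };
    /* intrusive hooks, present only with PARAVIRT_FULL */
    struct demo_pvfull_cpu_ops { void (*iret)(void); void (*swapgs)(void); };

    static void native_io_delay(void) { puts("native io_delay"); }
    static void xen_io_delay(void)    { puts("xen io_delay (nop)"); }
    static void native_iret(void)     { puts("native iret"); }
    static void xen_iret(void)        { puts("xen iret via hypercall"); }
    static void native_swapgs(void)   { puts("native swapgs"); }

    static struct demo_pv_cpu_ops pv_cpu_ops = { .io_delay = native_io_delay };
    static struct demo_pvfull_cpu_ops pvfull_cpu_ops = {
            .iret = native_iret, .swapgs = native_swapgs,
    };

    int main(void)
    {
            /* a pv-guest overrides both structures at boot */
            pv_cpu_ops.io_delay = xen_io_delay;
            pvfull_cpu_ops.iret = xen_iret;

            pv_cpu_ops.io_delay();
            pvfull_cpu_ops.iret();
            pvfull_cpu_ops.swapgs();  /* left native here, as Xen leaves some hooks */
            return 0;
    }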
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 07/10] paravirt: split pv_irq_ops for support of PARAVIRT_FULL
Move functions needed for fully paravirtualized guests only into a new structure pvfull_irq_ops in paravirt_types_full.h, paravirt_full.h and the associated vector into paravirt_full.c. Signed-off-by: Juergen Gross <jgross at suse.com> --- arch/x86/include/asm/irqflags.h | 44 +++++++++++++++--------------- arch/x86/include/asm/paravirt.h | 15 ---------- arch/x86/include/asm/paravirt_full.h | 17 ++++++++++++ arch/x86/include/asm/paravirt_types.h | 8 +----- arch/x86/include/asm/paravirt_types_full.h | 10 +++++++ arch/x86/kernel/asm-offsets.c | 2 ++ arch/x86/kernel/asm-offsets_64.c | 5 ++-- arch/x86/kernel/paravirt.c | 6 +--- arch/x86/kernel/paravirt_full.c | 9 ++++++ arch/x86/lguest/boot.c | 2 +- arch/x86/xen/irq.c | 3 ++ 11 files changed, 68 insertions(+), 53 deletions(-) diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h index c3319c20127c..2a6d7a675271 100644 --- a/arch/x86/include/asm/irqflags.h +++ b/arch/x86/include/asm/irqflags.h @@ -87,6 +87,26 @@ static inline notrace void arch_local_irq_enable(void) } /* + * For spinlocks, etc: + */ +static inline notrace unsigned long arch_local_irq_save(void) +{ + unsigned long flags = arch_local_save_flags(); + arch_local_irq_disable(); + return flags; +} +#else + +#define ENABLE_INTERRUPTS(x) sti +#define DISABLE_INTERRUPTS(x) cli + +#endif /* __ASSEMBLY__ */ +#endif /* CONFIG_PARAVIRT */ + +#ifndef CONFIG_PARAVIRT_FULL +#ifndef __ASSEMBLY__ + +/* * Used in the idle loop; sti takes one instruction cycle * to complete: */ @@ -104,30 +124,8 @@ static inline __cpuidle void halt(void) native_halt(); } -/* - * For spinlocks, etc: - */ -static inline notrace unsigned long arch_local_irq_save(void) -{ - unsigned long flags = arch_local_save_flags(); - arch_local_irq_disable(); - return flags; -} #else -#define ENABLE_INTERRUPTS(x) sti -#define DISABLE_INTERRUPTS(x) cli - -#ifdef CONFIG_X86_64 -#define PARAVIRT_ADJUST_EXCEPTION_FRAME /* */ -#endif - -#endif /* __ASSEMBLY__ */ -#endif /* CONFIG_PARAVIRT */ - -#ifndef CONFIG_PARAVIRT_FULL -#ifdef __ASSEMBLY__ - #ifdef CONFIG_X86_64 #define SWAPGS swapgs /* @@ -149,6 +147,8 @@ static inline notrace unsigned long arch_local_irq_save(void) swapgs; \ sysretl +#define PARAVIRT_ADJUST_EXCEPTION_FRAME /* */ + #else #define INTERRUPT_RETURN iret #define GET_CR0_INTO_EAX movl %cr0, %eax diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 2287a2465486..f1680e70162b 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -39,16 +39,6 @@ static inline void write_cr3(unsigned long x) PVOP_VCALL1(pv_mmu_ops.write_cr3, x); } -static inline void arch_safe_halt(void) -{ - PVOP_VCALL0(pv_irq_ops.safe_halt); -} - -static inline void halt(void) -{ - PVOP_VCALL0(pv_irq_ops.halt); -} - #define get_kernel_rpl() (pv_info.kernel_rpl) static inline unsigned long long paravirt_sched_clock(void) @@ -721,11 +711,6 @@ extern void default_banner(void); #define GET_CR2_INTO_RAX \ call PARA_INDIRECT(pv_mmu_ops+PV_MMU_read_cr2) -#define PARAVIRT_ADJUST_EXCEPTION_FRAME \ - PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_adjust_exception_frame), \ - CLBR_NONE, \ - call PARA_INDIRECT(pv_irq_ops+PV_IRQ_adjust_exception_frame)) - #endif /* CONFIG_X86_64 */ #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/include/asm/paravirt_full.h b/arch/x86/include/asm/paravirt_full.h index b3cf0960c161..64753ef1d36f 100644 --- a/arch/x86/include/asm/paravirt_full.h +++ b/arch/x86/include/asm/paravirt_full.h @@ -231,6 +231,16 @@ static inline void 
arch_end_context_switch(struct task_struct *next) PVOP_VCALL1(pvfull_cpu_ops.end_context_switch, next); } +static inline void arch_safe_halt(void) +{ + PVOP_VCALL0(pvfull_irq_ops.safe_halt); +} + +static inline void halt(void) +{ + PVOP_VCALL0(pvfull_irq_ops.halt); +} + #else /* __ASSEMBLY__ */ #define INTERRUPT_RETURN \ @@ -267,6 +277,13 @@ static inline void arch_end_context_switch(struct task_struct *next) PARA_SITE(PARA_PATCH(pvfull_cpu_ops, PV_CPU_usergs_sysret64), \ CLBR_NONE, \ jmp PARA_INDIRECT(pvfull_cpu_ops+PV_CPU_usergs_sysret64)) + +#define PARAVIRT_ADJUST_EXCEPTION_FRAME \ + PARA_SITE(PARA_PATCH(pvfull_irq_ops, PV_IRQ_adjust_exception_frame), \ + CLBR_NONE, \ + call PARA_INDIRECT(pvfull_irq_ops + \ + PV_IRQ_adjust_exception_frame)) + #endif /* CONFIG_X86_32 */ #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index e0fb1291bbdb..de95e6253516 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -120,13 +120,6 @@ struct pv_irq_ops { struct paravirt_callee_save restore_fl; struct paravirt_callee_save irq_disable; struct paravirt_callee_save irq_enable; - - void (*safe_halt)(void); - void (*halt)(void); - -#ifdef CONFIG_X86_64 - void (*adjust_exception_frame)(void); -#endif }; struct pv_mmu_ops { @@ -266,6 +259,7 @@ struct paravirt_patch_template { struct pv_lock_ops pv_lock_ops; #ifdef CONFIG_PARAVIRT_FULL struct pvfull_cpu_ops pvfull_cpu_ops; + struct pvfull_irq_ops pvfull_irq_ops; #endif }; diff --git a/arch/x86/include/asm/paravirt_types_full.h b/arch/x86/include/asm/paravirt_types_full.h index 50635628f6e8..eabc0ecec8e4 100644 --- a/arch/x86/include/asm/paravirt_types_full.h +++ b/arch/x86/include/asm/paravirt_types_full.h @@ -77,6 +77,16 @@ struct pvfull_cpu_ops { void (*end_context_switch)(struct task_struct *next); }; +struct pvfull_irq_ops { + void (*safe_halt)(void); + void (*halt)(void); + +#ifdef CONFIG_X86_64 + void (*adjust_exception_frame)(void); +#endif +}; + extern struct pvfull_cpu_ops pvfull_cpu_ops; +extern struct pvfull_irq_ops pvfull_irq_ops; #endif /* _ASM_X86_PARAVIRT_TYPES_FULL_H */ diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c index 7b393e453333..a32148390e49 100644 --- a/arch/x86/kernel/asm-offsets.c +++ b/arch/x86/kernel/asm-offsets.c @@ -72,6 +72,8 @@ void common(void) { #ifdef CONFIG_PARAVIRT_FULL OFFSET(PARAVIRT_PATCH_pvfull_cpu_ops, paravirt_patch_template, pvfull_cpu_ops); + OFFSET(PARAVIRT_PATCH_pvfull_irq_ops, paravirt_patch_template, + pvfull_irq_ops); OFFSET(PV_CPU_iret, pvfull_cpu_ops, iret); OFFSET(PV_CPU_read_cr0, pvfull_cpu_ops, read_cr0); #endif diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c index f4fe7d9ac0d9..9a09d7702efc 100644 --- a/arch/x86/kernel/asm-offsets_64.c +++ b/arch/x86/kernel/asm-offsets_64.c @@ -19,10 +19,9 @@ static char syscalls_ia32[] = { int main(void) { -#ifdef CONFIG_PARAVIRT - OFFSET(PV_IRQ_adjust_exception_frame, pv_irq_ops, adjust_exception_frame); -#endif #ifdef CONFIG_PARAVIRT_FULL + OFFSET(PV_IRQ_adjust_exception_frame, pvfull_irq_ops, + adjust_exception_frame); OFFSET(PV_CPU_usergs_sysret64, pvfull_cpu_ops, usergs_sysret64); OFFSET(PV_CPU_swapgs, pvfull_cpu_ops, swapgs); BLANK(); diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 6b90de65479e..8e22cfc73349 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -127,6 +127,7 @@ static void *get_call_destination(u8 type) #endif #ifdef 
CONFIG_PARAVIRT_FULL .pvfull_cpu_ops = pvfull_cpu_ops, + .pvfull_irq_ops = pvfull_irq_ops, #endif }; return *((void **)&tmpl + type); @@ -296,11 +297,6 @@ __visible struct pv_irq_ops pv_irq_ops = { .restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl), .irq_disable = __PV_IS_CALLEE_SAVE(native_irq_disable), .irq_enable = __PV_IS_CALLEE_SAVE(native_irq_enable), - .safe_halt = native_safe_halt, - .halt = native_halt, -#ifdef CONFIG_X86_64 - .adjust_exception_frame = paravirt_nop, -#endif }; __visible struct pv_cpu_ops pv_cpu_ops = { diff --git a/arch/x86/kernel/paravirt_full.c b/arch/x86/kernel/paravirt_full.c index 9b8708421cd2..353968da3ddc 100644 --- a/arch/x86/kernel/paravirt_full.c +++ b/arch/x86/kernel/paravirt_full.c @@ -74,9 +74,18 @@ __visible struct pvfull_cpu_ops pvfull_cpu_ops = { .end_context_switch = paravirt_nop, }; +__visible struct pvfull_irq_ops pvfull_irq_ops = { + .safe_halt = native_safe_halt, + .halt = native_halt, +#ifdef CONFIG_X86_64 + .adjust_exception_frame = paravirt_nop, +#endif +}; + /* At this point, native_get/set_debugreg has real function entries */ NOKPROBE_SYMBOL(native_get_debugreg); NOKPROBE_SYMBOL(native_set_debugreg); NOKPROBE_SYMBOL(native_load_idt); EXPORT_SYMBOL(pvfull_cpu_ops); +EXPORT_SYMBOL_GPL(pvfull_irq_ops); diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c index fa79dbe220ad..bf8773854ab0 100644 --- a/arch/x86/lguest/boot.c +++ b/arch/x86/lguest/boot.c @@ -1404,7 +1404,7 @@ __init void lguest_init(void) pv_irq_ops.restore_fl = __PV_IS_CALLEE_SAVE(lg_restore_fl); pv_irq_ops.irq_disable = PV_CALLEE_SAVE(lguest_irq_disable); pv_irq_ops.irq_enable = __PV_IS_CALLEE_SAVE(lg_irq_enable); - pv_irq_ops.safe_halt = lguest_safe_halt; + pvfull_irq_ops.safe_halt = lguest_safe_halt; /* Setup operations */ pv_init_ops.patch = lguest_patch; diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c index 3b55ae664521..c9dba9d8cecf 100644 --- a/arch/x86/xen/irq.c +++ b/arch/x86/xen/irq.c @@ -120,7 +120,9 @@ static const struct pv_irq_ops xen_irq_ops __initconst = { .restore_fl = PV_CALLEE_SAVE(xen_restore_fl), .irq_disable = PV_CALLEE_SAVE(xen_irq_disable), .irq_enable = PV_CALLEE_SAVE(xen_irq_enable), +}; +static const struct pvfull_irq_ops xen_full_irq_ops __initconst = { .safe_halt = xen_safe_halt, .halt = xen_halt, #ifdef CONFIG_X86_64 @@ -131,5 +133,6 @@ static const struct pv_irq_ops xen_irq_ops __initconst = { void __init xen_init_irq_ops(void) { pv_irq_ops = xen_irq_ops; + pvfull_irq_ops = xen_full_irq_ops; x86_init.irqs.intr_init = xen_init_IRQ; } -- 2.12.0
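How a PARAVIRT-only build differs from a PARAVIRT_FULL build after this split can be sketched the same way. The stand-alone model below (again, all _demo names and the -DCONFIG_PARAVIRT_FULL_DEMO switch are invented for illustration) mirrors the structure of the patch: the cheap irq-flag hooks stay available to every paravirt kernel, while the pv-guest-only halt hooks exist solely when the full option is compiled in.

    /* Sketch of the pv_irq_ops / pvfull_irq_ops split; build with
     * "gcc demo.c" or "gcc -DCONFIG_PARAVIRT_FULL_DEMO demo.c". */
    #include <stdio.h>

    struct pv_irq_ops_demo {            /* kept for every PARAVIRT kernel */
            unsigned long (*save_fl)(void);
            void (*irq_disable)(void);
            void (*irq_enable)(void);
    };

    #ifdef CONFIG_PARAVIRT_FULL_DEMO
    struct pvfull_irq_ops_demo {        /* only for Xen pv / lguest kernels */
            void (*safe_halt)(void);
            void (*halt)(void);
    };
    #endif

    static unsigned long native_save_fl_demo(void) { return 0x202; }
    static void native_irq_disable_demo(void) { puts("cli"); }
    static void native_irq_enable_demo(void)  { puts("sti"); }

    static struct pv_irq_ops_demo pv_irq_ops_demo = {
            .save_fl     = native_save_fl_demo,
            .irq_disable = native_irq_disable_demo,
            .irq_enable  = native_irq_enable_demo,
    };

    #ifdef CONFIG_PARAVIRT_FULL_DEMO
    static void native_safe_halt_demo(void) { puts("sti; hlt"); }
    static void native_halt_demo(void)      { puts("hlt"); }

    static struct pvfull_irq_ops_demo pvfull_irq_ops_demo = {
            .safe_halt = native_safe_halt_demo,
            .halt      = native_halt_demo,
    };
    #endif

    int main(void)
    {
            pv_irq_ops_demo.irq_disable();
            printf("flags: %#lx\n", pv_irq_ops_demo.save_fl());
            pv_irq_ops_demo.irq_enable();
    #ifdef CONFIG_PARAVIRT_FULL_DEMO
            pvfull_irq_ops_demo.safe_halt();
    #endif
            return 0;
    }

Compiling the sketch with -DCONFIG_PARAVIRT_FULL_DEMO corresponds to a XEN_PV or LGUEST_GUEST kernel; without it the safe_halt/halt slots do not exist at all, matching how the hunks above move them out of pv_irq_ops and behind CONFIG_PARAVIRT_FULL.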
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 08/10] paravirt: split pv_mmu_ops for support of PARAVIRT_FULL
Move functions needed for fully paravirtualized guests only into a new structure pvfull_mmu_ops in paravirt_types_full.h, paravirt_full.h and the associated vector into paravirt_full.c. .flush_tlb_others is left in pv_mmu_ops as hyperv support will use it soon. Signed-off-by: Juergen Gross <jgross at suse.com> --- arch/x86/include/asm/fixmap.h | 2 +- arch/x86/include/asm/mmu_context.h | 4 +- arch/x86/include/asm/paravirt.h | 442 ++--------------------------- arch/x86/include/asm/paravirt_full.h | 422 +++++++++++++++++++++++++++ arch/x86/include/asm/paravirt_types.h | 117 +------- arch/x86/include/asm/paravirt_types_full.h | 116 ++++++++ arch/x86/include/asm/pgalloc.h | 2 +- arch/x86/include/asm/pgtable.h | 8 +- arch/x86/include/asm/special_insns.h | 6 +- arch/x86/include/asm/tlbflush.h | 2 +- arch/x86/kernel/asm-offsets.c | 4 +- arch/x86/kernel/head_64.S | 2 +- arch/x86/kernel/paravirt.c | 171 ----------- arch/x86/kernel/paravirt_full.c | 176 ++++++++++++ arch/x86/kernel/paravirt_patch_32.c | 12 +- arch/x86/kernel/paravirt_patch_64.c | 16 +- arch/x86/lguest/boot.c | 36 +-- arch/x86/xen/enlighten_pv.c | 8 +- arch/x86/xen/mmu_pv.c | 34 +-- 19 files changed, 797 insertions(+), 783 deletions(-) diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index b65155cc3760..dfef874cb9d6 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -149,7 +149,7 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte); void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags); -#ifndef CONFIG_PARAVIRT +#ifndef CONFIG_PARAVIRT_FULL static inline void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags) { diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 68b329d77b3a..b38431024463 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -12,12 +12,12 @@ #include <asm/tlbflush.h> #include <asm/paravirt.h> #include <asm/mpx.h> -#ifndef CONFIG_PARAVIRT +#ifndef CONFIG_PARAVIRT_FULL static inline void paravirt_activate_mm(struct mm_struct *prev, struct mm_struct *next) { } -#endif /* !CONFIG_PARAVIRT */ +#endif /* !CONFIG_PARAVIRT_FULL */ #ifdef CONFIG_PERF_EVENTS extern struct static_key rdpmc_always_available; diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index f1680e70162b..3b9960a5de4a 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -17,28 +17,15 @@ #ifdef CONFIG_PARAVIRT_FULL #include <asm/paravirt_full.h> +#else + +static inline enum paravirt_lazy_mode paravirt_get_lazy_mode(void) +{ + return PARAVIRT_LAZY_NONE; +} + #endif -static inline unsigned long read_cr2(void) -{ - return PVOP_CALL0(unsigned long, pv_mmu_ops.read_cr2); -} - -static inline void write_cr2(unsigned long x) -{ - PVOP_VCALL1(pv_mmu_ops.write_cr2, x); -} - -static inline unsigned long read_cr3(void) -{ - return PVOP_CALL0(unsigned long, pv_mmu_ops.read_cr3); -} - -static inline void write_cr3(unsigned long x) -{ - PVOP_VCALL1(pv_mmu_ops.write_cr3, x); -} - #define get_kernel_rpl() (pv_info.kernel_rpl) static inline unsigned long long paravirt_sched_clock(void) @@ -66,36 +53,11 @@ static inline void slow_down_io(void) #endif } -static inline void paravirt_activate_mm(struct mm_struct *prev, - struct mm_struct *next) -{ - PVOP_VCALL2(pv_mmu_ops.activate_mm, prev, next); -} - -static inline void paravirt_arch_dup_mmap(struct mm_struct *oldmm, - struct mm_struct *mm) -{ - 
PVOP_VCALL2(pv_mmu_ops.dup_mmap, oldmm, mm); -} - static inline void paravirt_arch_exit_mmap(struct mm_struct *mm) { PVOP_VCALL1(pv_mmu_ops.exit_mmap, mm); } -static inline void __flush_tlb(void) -{ - PVOP_VCALL0(pv_mmu_ops.flush_tlb_user); -} -static inline void __flush_tlb_global(void) -{ - PVOP_VCALL0(pv_mmu_ops.flush_tlb_kernel); -} -static inline void __flush_tlb_single(unsigned long addr) -{ - PVOP_VCALL1(pv_mmu_ops.flush_tlb_single, addr); -} - static inline void flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm, unsigned long start, @@ -104,375 +66,6 @@ static inline void flush_tlb_others(const struct cpumask *cpumask, PVOP_VCALL4(pv_mmu_ops.flush_tlb_others, cpumask, mm, start, end); } -static inline int paravirt_pgd_alloc(struct mm_struct *mm) -{ - return PVOP_CALL1(int, pv_mmu_ops.pgd_alloc, mm); -} - -static inline void paravirt_pgd_free(struct mm_struct *mm, pgd_t *pgd) -{ - PVOP_VCALL2(pv_mmu_ops.pgd_free, mm, pgd); -} - -static inline void paravirt_alloc_pte(struct mm_struct *mm, unsigned long pfn) -{ - PVOP_VCALL2(pv_mmu_ops.alloc_pte, mm, pfn); -} -static inline void paravirt_release_pte(unsigned long pfn) -{ - PVOP_VCALL1(pv_mmu_ops.release_pte, pfn); -} - -static inline void paravirt_alloc_pmd(struct mm_struct *mm, unsigned long pfn) -{ - PVOP_VCALL2(pv_mmu_ops.alloc_pmd, mm, pfn); -} - -static inline void paravirt_release_pmd(unsigned long pfn) -{ - PVOP_VCALL1(pv_mmu_ops.release_pmd, pfn); -} - -static inline void paravirt_alloc_pud(struct mm_struct *mm, unsigned long pfn) -{ - PVOP_VCALL2(pv_mmu_ops.alloc_pud, mm, pfn); -} -static inline void paravirt_release_pud(unsigned long pfn) -{ - PVOP_VCALL1(pv_mmu_ops.release_pud, pfn); -} - -static inline void paravirt_alloc_p4d(struct mm_struct *mm, unsigned long pfn) -{ - PVOP_VCALL2(pv_mmu_ops.alloc_p4d, mm, pfn); -} - -static inline void paravirt_release_p4d(unsigned long pfn) -{ - PVOP_VCALL1(pv_mmu_ops.release_p4d, pfn); -} - -static inline void pte_update(struct mm_struct *mm, unsigned long addr, - pte_t *ptep) -{ - PVOP_VCALL3(pv_mmu_ops.pte_update, mm, addr, ptep); -} - -static inline pte_t __pte(pteval_t val) -{ - pteval_t ret; - - if (sizeof(pteval_t) > sizeof(long)) - ret = PVOP_CALLEE2(pteval_t, - pv_mmu_ops.make_pte, - val, (u64)val >> 32); - else - ret = PVOP_CALLEE1(pteval_t, - pv_mmu_ops.make_pte, - val); - - return (pte_t) { .pte = ret }; -} - -static inline pteval_t pte_val(pte_t pte) -{ - pteval_t ret; - - if (sizeof(pteval_t) > sizeof(long)) - ret = PVOP_CALLEE2(pteval_t, pv_mmu_ops.pte_val, - pte.pte, (u64)pte.pte >> 32); - else - ret = PVOP_CALLEE1(pteval_t, pv_mmu_ops.pte_val, - pte.pte); - - return ret; -} - -static inline pgd_t __pgd(pgdval_t val) -{ - pgdval_t ret; - - if (sizeof(pgdval_t) > sizeof(long)) - ret = PVOP_CALLEE2(pgdval_t, pv_mmu_ops.make_pgd, - val, (u64)val >> 32); - else - ret = PVOP_CALLEE1(pgdval_t, pv_mmu_ops.make_pgd, - val); - - return (pgd_t) { ret }; -} - -static inline pgdval_t pgd_val(pgd_t pgd) -{ - pgdval_t ret; - - if (sizeof(pgdval_t) > sizeof(long)) - ret = PVOP_CALLEE2(pgdval_t, pv_mmu_ops.pgd_val, - pgd.pgd, (u64)pgd.pgd >> 32); - else - ret = PVOP_CALLEE1(pgdval_t, pv_mmu_ops.pgd_val, - pgd.pgd); - - return ret; -} - -#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION -static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, unsigned long addr, - pte_t *ptep) -{ - pteval_t ret; - - ret = PVOP_CALL3(pteval_t, pv_mmu_ops.ptep_modify_prot_start, - mm, addr, ptep); - - return (pte_t) { .pte = ret }; -} - -static inline void 
ptep_modify_prot_commit(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, pte_t pte) -{ - if (sizeof(pteval_t) > sizeof(long)) - /* 5 arg words */ - pv_mmu_ops.ptep_modify_prot_commit(mm, addr, ptep, pte); - else - PVOP_VCALL4(pv_mmu_ops.ptep_modify_prot_commit, - mm, addr, ptep, pte.pte); -} - -static inline void set_pte(pte_t *ptep, pte_t pte) -{ - if (sizeof(pteval_t) > sizeof(long)) - PVOP_VCALL3(pv_mmu_ops.set_pte, ptep, - pte.pte, (u64)pte.pte >> 32); - else - PVOP_VCALL2(pv_mmu_ops.set_pte, ptep, - pte.pte); -} - -static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, pte_t pte) -{ - if (sizeof(pteval_t) > sizeof(long)) - /* 5 arg words */ - pv_mmu_ops.set_pte_at(mm, addr, ptep, pte); - else - PVOP_VCALL4(pv_mmu_ops.set_pte_at, mm, addr, ptep, pte.pte); -} - -static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, - pmd_t *pmdp, pmd_t pmd) -{ - if (sizeof(pmdval_t) > sizeof(long)) - /* 5 arg words */ - pv_mmu_ops.set_pmd_at(mm, addr, pmdp, pmd); - else - PVOP_VCALL4(pv_mmu_ops.set_pmd_at, mm, addr, pmdp, - native_pmd_val(pmd)); -} - -static inline void set_pud_at(struct mm_struct *mm, unsigned long addr, - pud_t *pudp, pud_t pud) -{ - if (sizeof(pudval_t) > sizeof(long)) - /* 5 arg words */ - pv_mmu_ops.set_pud_at(mm, addr, pudp, pud); - else - PVOP_VCALL4(pv_mmu_ops.set_pud_at, mm, addr, pudp, - native_pud_val(pud)); -} - -static inline void set_pmd(pmd_t *pmdp, pmd_t pmd) -{ - pmdval_t val = native_pmd_val(pmd); - - if (sizeof(pmdval_t) > sizeof(long)) - PVOP_VCALL3(pv_mmu_ops.set_pmd, pmdp, val, (u64)val >> 32); - else - PVOP_VCALL2(pv_mmu_ops.set_pmd, pmdp, val); -} - -#if CONFIG_PGTABLE_LEVELS >= 3 -static inline pmd_t __pmd(pmdval_t val) -{ - pmdval_t ret; - - if (sizeof(pmdval_t) > sizeof(long)) - ret = PVOP_CALLEE2(pmdval_t, pv_mmu_ops.make_pmd, - val, (u64)val >> 32); - else - ret = PVOP_CALLEE1(pmdval_t, pv_mmu_ops.make_pmd, - val); - - return (pmd_t) { ret }; -} - -static inline pmdval_t pmd_val(pmd_t pmd) -{ - pmdval_t ret; - - if (sizeof(pmdval_t) > sizeof(long)) - ret = PVOP_CALLEE2(pmdval_t, pv_mmu_ops.pmd_val, - pmd.pmd, (u64)pmd.pmd >> 32); - else - ret = PVOP_CALLEE1(pmdval_t, pv_mmu_ops.pmd_val, - pmd.pmd); - - return ret; -} - -static inline void set_pud(pud_t *pudp, pud_t pud) -{ - pudval_t val = native_pud_val(pud); - - if (sizeof(pudval_t) > sizeof(long)) - PVOP_VCALL3(pv_mmu_ops.set_pud, pudp, - val, (u64)val >> 32); - else - PVOP_VCALL2(pv_mmu_ops.set_pud, pudp, - val); -} -#if CONFIG_PGTABLE_LEVELS >= 4 -static inline pud_t __pud(pudval_t val) -{ - pudval_t ret; - - if (sizeof(pudval_t) > sizeof(long)) - ret = PVOP_CALLEE2(pudval_t, pv_mmu_ops.make_pud, - val, (u64)val >> 32); - else - ret = PVOP_CALLEE1(pudval_t, pv_mmu_ops.make_pud, - val); - - return (pud_t) { ret }; -} - -static inline pudval_t pud_val(pud_t pud) -{ - pudval_t ret; - - if (sizeof(pudval_t) > sizeof(long)) - ret = PVOP_CALLEE2(pudval_t, pv_mmu_ops.pud_val, - pud.pud, (u64)pud.pud >> 32); - else - ret = PVOP_CALLEE1(pudval_t, pv_mmu_ops.pud_val, - pud.pud); - - return ret; -} - -static inline void pud_clear(pud_t *pudp) -{ - set_pud(pudp, __pud(0)); -} - -static inline void set_p4d(p4d_t *p4dp, p4d_t p4d) -{ - p4dval_t val = native_p4d_val(p4d); - - if (sizeof(p4dval_t) > sizeof(long)) - PVOP_VCALL3(pv_mmu_ops.set_p4d, p4dp, - val, (u64)val >> 32); - else - PVOP_VCALL2(pv_mmu_ops.set_p4d, p4dp, - val); -} - -#if CONFIG_PGTABLE_LEVELS >= 5 - -static inline p4d_t __p4d(p4dval_t val) -{ - p4dval_t ret = PVOP_CALLEE1(p4dval_t, 
pv_mmu_ops.make_p4d, val); - - return (p4d_t) { ret }; -} - -static inline p4dval_t p4d_val(p4d_t p4d) -{ - return PVOP_CALLEE1(p4dval_t, pv_mmu_ops.p4d_val, p4d.p4d); -} - -static inline void set_pgd(pgd_t *pgdp, pgd_t pgd) -{ - pgdval_t val = native_pgd_val(pgd); - - PVOP_VCALL2(pv_mmu_ops.set_pgd, pgdp, val); -} - -static inline void pgd_clear(pgd_t *pgdp) -{ - set_pgd(pgdp, __pgd(0)); -} - -#endif /* CONFIG_PGTABLE_LEVELS == 5 */ - -static inline void p4d_clear(p4d_t *p4dp) -{ - set_p4d(p4dp, __p4d(0)); -} - -#endif /* CONFIG_PGTABLE_LEVELS == 4 */ - -#endif /* CONFIG_PGTABLE_LEVELS >= 3 */ - -#ifdef CONFIG_X86_PAE -/* Special-case pte-setting operations for PAE, which can't update a - 64-bit pte atomically */ -static inline void set_pte_atomic(pte_t *ptep, pte_t pte) -{ - PVOP_VCALL3(pv_mmu_ops.set_pte_atomic, ptep, - pte.pte, pte.pte >> 32); -} - -static inline void pte_clear(struct mm_struct *mm, unsigned long addr, - pte_t *ptep) -{ - PVOP_VCALL3(pv_mmu_ops.pte_clear, mm, addr, ptep); -} - -static inline void pmd_clear(pmd_t *pmdp) -{ - PVOP_VCALL1(pv_mmu_ops.pmd_clear, pmdp); -} -#else /* !CONFIG_X86_PAE */ -static inline void set_pte_atomic(pte_t *ptep, pte_t pte) -{ - set_pte(ptep, pte); -} - -static inline void pte_clear(struct mm_struct *mm, unsigned long addr, - pte_t *ptep) -{ - set_pte_at(mm, addr, ptep, __pte(0)); -} - -static inline void pmd_clear(pmd_t *pmdp) -{ - set_pmd(pmdp, __pmd(0)); -} -#endif /* CONFIG_X86_PAE */ - -#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE -static inline void arch_enter_lazy_mmu_mode(void) -{ - PVOP_VCALL0(pv_mmu_ops.lazy_mode.enter); -} - -static inline void arch_leave_lazy_mmu_mode(void) -{ - PVOP_VCALL0(pv_mmu_ops.lazy_mode.leave); -} - -static inline void arch_flush_lazy_mmu_mode(void) -{ - PVOP_VCALL0(pv_mmu_ops.lazy_mode.flush); -} - -static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx, - phys_addr_t phys, pgprot_t flags) -{ - pv_mmu_ops.set_fixmap(idx, phys, flags); -} - #if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) static __always_inline void pv_queued_spin_lock_slowpath(struct qspinlock *lock, @@ -706,25 +299,22 @@ extern void default_banner(void); call PARA_INDIRECT(pv_irq_ops+PV_IRQ_irq_enable); \ PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);) -#ifdef CONFIG_X86_64 - -#define GET_CR2_INTO_RAX \ - call PARA_INDIRECT(pv_mmu_ops+PV_MMU_read_cr2) - -#endif /* CONFIG_X86_64 */ - #endif /* __ASSEMBLY__ */ #else /* CONFIG_PARAVIRT */ # define default_banner x86_init_noop #ifndef __ASSEMBLY__ -static inline void paravirt_arch_dup_mmap(struct mm_struct *oldmm, - struct mm_struct *mm) -{ -} - static inline void paravirt_arch_exit_mmap(struct mm_struct *mm) { } #endif /* __ASSEMBLY__ */ #endif /* !CONFIG_PARAVIRT */ + +#ifndef CONFIG_PARAVIRT_FULL +#ifndef __ASSEMBLY__ +static inline void paravirt_arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ +} +#endif /* __ASSEMBLY__ */ +#endif /* CONFIG_PARAVIRT_FULL */ #endif /* _ASM_X86_PARAVIRT_H */ diff --git a/arch/x86/include/asm/paravirt_full.h b/arch/x86/include/asm/paravirt_full.h index 64753ef1d36f..53f2eb436ba3 100644 --- a/arch/x86/include/asm/paravirt_full.h +++ b/arch/x86/include/asm/paravirt_full.h @@ -241,6 +241,425 @@ static inline void halt(void) PVOP_VCALL0(pvfull_irq_ops.halt); } +static inline unsigned long read_cr2(void) +{ + return PVOP_CALL0(unsigned long, pvfull_mmu_ops.read_cr2); +} + +static inline void write_cr2(unsigned long x) +{ + PVOP_VCALL1(pvfull_mmu_ops.write_cr2, x); +} + +static inline unsigned long read_cr3(void) +{ + 
return PVOP_CALL0(unsigned long, pvfull_mmu_ops.read_cr3); +} + +static inline void write_cr3(unsigned long x) +{ + PVOP_VCALL1(pvfull_mmu_ops.write_cr3, x); +} + +static inline void paravirt_activate_mm(struct mm_struct *prev, + struct mm_struct *next) +{ + PVOP_VCALL2(pvfull_mmu_ops.activate_mm, prev, next); +} + +static inline void paravirt_arch_dup_mmap(struct mm_struct *oldmm, + struct mm_struct *mm) +{ + PVOP_VCALL2(pvfull_mmu_ops.dup_mmap, oldmm, mm); +} + +static inline void __flush_tlb(void) +{ + PVOP_VCALL0(pvfull_mmu_ops.flush_tlb_user); +} + +static inline void __flush_tlb_global(void) +{ + PVOP_VCALL0(pvfull_mmu_ops.flush_tlb_kernel); +} + +static inline void __flush_tlb_single(unsigned long addr) +{ + PVOP_VCALL1(pvfull_mmu_ops.flush_tlb_single, addr); +} + +static inline int paravirt_pgd_alloc(struct mm_struct *mm) +{ + return PVOP_CALL1(int, pvfull_mmu_ops.pgd_alloc, mm); +} + +static inline void paravirt_pgd_free(struct mm_struct *mm, pgd_t *pgd) +{ + PVOP_VCALL2(pvfull_mmu_ops.pgd_free, mm, pgd); +} + +static inline void paravirt_alloc_pte(struct mm_struct *mm, unsigned long pfn) +{ + PVOP_VCALL2(pvfull_mmu_ops.alloc_pte, mm, pfn); +} + +static inline void paravirt_release_pte(unsigned long pfn) +{ + PVOP_VCALL1(pvfull_mmu_ops.release_pte, pfn); +} + +static inline void paravirt_alloc_pmd(struct mm_struct *mm, unsigned long pfn) +{ + PVOP_VCALL2(pvfull_mmu_ops.alloc_pmd, mm, pfn); +} + +static inline void paravirt_release_pmd(unsigned long pfn) +{ + PVOP_VCALL1(pvfull_mmu_ops.release_pmd, pfn); +} + +static inline void paravirt_alloc_pud(struct mm_struct *mm, unsigned long pfn) +{ + PVOP_VCALL2(pvfull_mmu_ops.alloc_pud, mm, pfn); +} + +static inline void paravirt_release_pud(unsigned long pfn) +{ + PVOP_VCALL1(pvfull_mmu_ops.release_pud, pfn); +} + +static inline void paravirt_alloc_p4d(struct mm_struct *mm, unsigned long pfn) +{ + PVOP_VCALL2(pvfull_mmu_ops.alloc_p4d, mm, pfn); +} + +static inline void paravirt_release_p4d(unsigned long pfn) +{ + PVOP_VCALL1(pvfull_mmu_ops.release_p4d, pfn); +} + +static inline void pte_update(struct mm_struct *mm, unsigned long addr, + pte_t *ptep) +{ + PVOP_VCALL3(pvfull_mmu_ops.pte_update, mm, addr, ptep); +} + +static inline pte_t __pte(pteval_t val) +{ + pteval_t ret; + + if (sizeof(pteval_t) > sizeof(long)) + ret = PVOP_CALLEE2(pteval_t, + pvfull_mmu_ops.make_pte, + val, (u64)val >> 32); + else + ret = PVOP_CALLEE1(pteval_t, + pvfull_mmu_ops.make_pte, + val); + + return (pte_t) { .pte = ret }; +} + +static inline pteval_t pte_val(pte_t pte) +{ + pteval_t ret; + + if (sizeof(pteval_t) > sizeof(long)) + ret = PVOP_CALLEE2(pteval_t, pvfull_mmu_ops.pte_val, + pte.pte, (u64)pte.pte >> 32); + else + ret = PVOP_CALLEE1(pteval_t, pvfull_mmu_ops.pte_val, + pte.pte); + + return ret; +} + +static inline pgd_t __pgd(pgdval_t val) +{ + pgdval_t ret; + + if (sizeof(pgdval_t) > sizeof(long)) + ret = PVOP_CALLEE2(pgdval_t, pvfull_mmu_ops.make_pgd, + val, (u64)val >> 32); + else + ret = PVOP_CALLEE1(pgdval_t, pvfull_mmu_ops.make_pgd, + val); + + return (pgd_t) { ret }; +} + +static inline pgdval_t pgd_val(pgd_t pgd) +{ + pgdval_t ret; + + if (sizeof(pgdval_t) > sizeof(long)) + ret = PVOP_CALLEE2(pgdval_t, pvfull_mmu_ops.pgd_val, + pgd.pgd, (u64)pgd.pgd >> 32); + else + ret = PVOP_CALLEE1(pgdval_t, pvfull_mmu_ops.pgd_val, + pgd.pgd); + + return ret; +} + +#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION +static inline pte_t ptep_modify_prot_start(struct mm_struct *mm, + unsigned long addr, pte_t *ptep) +{ + pteval_t ret; + + ret = 
PVOP_CALL3(pteval_t, pvfull_mmu_ops.ptep_modify_prot_start, + mm, addr, ptep); + + return (pte_t) { .pte = ret }; +} + +static inline void ptep_modify_prot_commit(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + pte_t pte) +{ + if (sizeof(pteval_t) > sizeof(long)) + /* 5 arg words */ + pvfull_mmu_ops.ptep_modify_prot_commit(mm, addr, ptep, pte); + else + PVOP_VCALL4(pvfull_mmu_ops.ptep_modify_prot_commit, + mm, addr, ptep, pte.pte); +} + +static inline void set_pte(pte_t *ptep, pte_t pte) +{ + if (sizeof(pteval_t) > sizeof(long)) + PVOP_VCALL3(pvfull_mmu_ops.set_pte, ptep, + pte.pte, (u64)pte.pte >> 32); + else + PVOP_VCALL2(pvfull_mmu_ops.set_pte, ptep, + pte.pte); +} + +static inline void set_pte_at(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + if (sizeof(pteval_t) > sizeof(long)) + /* 5 arg words */ + pvfull_mmu_ops.set_pte_at(mm, addr, ptep, pte); + else + PVOP_VCALL4(pvfull_mmu_ops.set_pte_at, mm, addr, ptep, pte.pte); +} + +static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, + pmd_t *pmdp, pmd_t pmd) +{ + if (sizeof(pmdval_t) > sizeof(long)) + /* 5 arg words */ + pvfull_mmu_ops.set_pmd_at(mm, addr, pmdp, pmd); + else + PVOP_VCALL4(pvfull_mmu_ops.set_pmd_at, mm, addr, pmdp, + native_pmd_val(pmd)); +} + +static inline void set_pud_at(struct mm_struct *mm, unsigned long addr, + pud_t *pudp, pud_t pud) +{ + if (sizeof(pudval_t) > sizeof(long)) + /* 5 arg words */ + pvfull_mmu_ops.set_pud_at(mm, addr, pudp, pud); + else + PVOP_VCALL4(pvfull_mmu_ops.set_pud_at, mm, addr, pudp, + native_pud_val(pud)); +} + +static inline void set_pmd(pmd_t *pmdp, pmd_t pmd) +{ + pmdval_t val = native_pmd_val(pmd); + + if (sizeof(pmdval_t) > sizeof(long)) + PVOP_VCALL3(pvfull_mmu_ops.set_pmd, pmdp, val, (u64)val >> 32); + else + PVOP_VCALL2(pvfull_mmu_ops.set_pmd, pmdp, val); +} + +#if CONFIG_PGTABLE_LEVELS >= 3 +static inline pmd_t __pmd(pmdval_t val) +{ + pmdval_t ret; + + if (sizeof(pmdval_t) > sizeof(long)) + ret = PVOP_CALLEE2(pmdval_t, pvfull_mmu_ops.make_pmd, + val, (u64)val >> 32); + else + ret = PVOP_CALLEE1(pmdval_t, pvfull_mmu_ops.make_pmd, + val); + + return (pmd_t) { ret }; +} + +static inline pmdval_t pmd_val(pmd_t pmd) +{ + pmdval_t ret; + + if (sizeof(pmdval_t) > sizeof(long)) + ret = PVOP_CALLEE2(pmdval_t, pvfull_mmu_ops.pmd_val, + pmd.pmd, (u64)pmd.pmd >> 32); + else + ret = PVOP_CALLEE1(pmdval_t, pvfull_mmu_ops.pmd_val, + pmd.pmd); + + return ret; +} + +static inline void set_pud(pud_t *pudp, pud_t pud) +{ + pudval_t val = native_pud_val(pud); + + if (sizeof(pudval_t) > sizeof(long)) + PVOP_VCALL3(pvfull_mmu_ops.set_pud, pudp, + val, (u64)val >> 32); + else + PVOP_VCALL2(pvfull_mmu_ops.set_pud, pudp, + val); +} + +#if CONFIG_PGTABLE_LEVELS >= 4 +static inline pud_t __pud(pudval_t val) +{ + pudval_t ret; + + if (sizeof(pudval_t) > sizeof(long)) + ret = PVOP_CALLEE2(pudval_t, pvfull_mmu_ops.make_pud, + val, (u64)val >> 32); + else + ret = PVOP_CALLEE1(pudval_t, pvfull_mmu_ops.make_pud, + val); + + return (pud_t) { ret }; +} + +static inline pudval_t pud_val(pud_t pud) +{ + pudval_t ret; + + if (sizeof(pudval_t) > sizeof(long)) + ret = PVOP_CALLEE2(pudval_t, pvfull_mmu_ops.pud_val, + pud.pud, (u64)pud.pud >> 32); + else + ret = PVOP_CALLEE1(pudval_t, pvfull_mmu_ops.pud_val, + pud.pud); + + return ret; +} + +static inline void pud_clear(pud_t *pudp) +{ + set_pud(pudp, __pud(0)); +} + +static inline void set_p4d(p4d_t *p4dp, p4d_t p4d) +{ + p4dval_t val = native_p4d_val(p4d); + + if (sizeof(p4dval_t) > sizeof(long)) + 
PVOP_VCALL3(pvfull_mmu_ops.set_p4d, p4dp, + val, (u64)val >> 32); + else + PVOP_VCALL2(pvfull_mmu_ops.set_p4d, p4dp, + val); +} + +#if CONFIG_PGTABLE_LEVELS >= 5 +static inline p4d_t __p4d(p4dval_t val) +{ + p4dval_t ret = PVOP_CALLEE1(p4dval_t, pvfull_mmu_ops.make_p4d, val); + + return (p4d_t) { ret }; +} + +static inline p4dval_t p4d_val(p4d_t p4d) +{ + return PVOP_CALLEE1(p4dval_t, pvfull_mmu_ops.p4d_val, p4d.p4d); +} + +static inline void set_pgd(pgd_t *pgdp, pgd_t pgd) +{ + pgdval_t val = native_pgd_val(pgd); + + PVOP_VCALL2(pvfull_mmu_ops.set_pgd, pgdp, val); +} + +static inline void pgd_clear(pgd_t *pgdp) +{ + set_pgd(pgdp, __pgd(0)); +} + +#endif /* CONFIG_PGTABLE_LEVELS >= 5 */ + +static inline void p4d_clear(p4d_t *p4dp) +{ + set_p4d(p4dp, __p4d(0)); +} + +#endif /* CONFIG_PGTABLE_LEVELS >= 4 */ + +#endif /* CONFIG_PGTABLE_LEVELS >= 3 */ + +#ifdef CONFIG_X86_PAE +/* Special-case pte-setting operations for PAE, which can't update a + 64-bit pte atomically */ +static inline void set_pte_atomic(pte_t *ptep, pte_t pte) +{ + PVOP_VCALL3(pvfull_mmu_ops.set_pte_atomic, ptep, + pte.pte, pte.pte >> 32); +} + +static inline void pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t *ptep) +{ + PVOP_VCALL3(pvfull_mmu_ops.pte_clear, mm, addr, ptep); +} + +static inline void pmd_clear(pmd_t *pmdp) +{ + PVOP_VCALL1(pvfull_mmu_ops.pmd_clear, pmdp); +} +#else /* !CONFIG_X86_PAE */ +static inline void set_pte_atomic(pte_t *ptep, pte_t pte) +{ + set_pte(ptep, pte); +} + +static inline void pte_clear(struct mm_struct *mm, unsigned long addr, + pte_t *ptep) +{ + set_pte_at(mm, addr, ptep, __pte(0)); +} + +static inline void pmd_clear(pmd_t *pmdp) +{ + set_pmd(pmdp, __pmd(0)); +} +#endif /* CONFIG_X86_PAE */ + +#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE +static inline void arch_enter_lazy_mmu_mode(void) +{ + PVOP_VCALL0(pvfull_mmu_ops.lazy_mode.enter); +} + +static inline void arch_leave_lazy_mmu_mode(void) +{ + PVOP_VCALL0(pvfull_mmu_ops.lazy_mode.leave); +} + +static inline void arch_flush_lazy_mmu_mode(void) +{ + PVOP_VCALL0(pvfull_mmu_ops.lazy_mode.flush); +} + +static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx, + phys_addr_t phys, pgprot_t flags) +{ + pvfull_mmu_ops.set_fixmap(idx, phys, flags); +} + #else /* __ASSEMBLY__ */ #define INTERRUPT_RETURN \ @@ -284,6 +703,9 @@ static inline void halt(void) call PARA_INDIRECT(pvfull_irq_ops + \ PV_IRQ_adjust_exception_frame)) +#define GET_CR2_INTO_RAX \ + call PARA_INDIRECT(pvfull_mmu_ops+PV_MMU_read_cr2) + #endif /* CONFIG_X86_32 */ #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index de95e6253516..b1ac2a5698b4 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -89,14 +89,6 @@ struct pv_init_ops { unsigned long addr, unsigned len); }; - -struct pv_lazy_ops { - /* Set deferred update mode, used for batching operations. */ - void (*enter)(void); - void (*leave)(void); - void (*flush)(void); -}; - struct pv_time_ops { unsigned long long (*sched_clock)(void); unsigned long long (*steal_clock)(int cpu); @@ -123,111 +115,11 @@ struct pv_irq_ops { }; struct pv_mmu_ops { - unsigned long (*read_cr2)(void); - void (*write_cr2)(unsigned long); - - unsigned long (*read_cr3)(void); - void (*write_cr3)(unsigned long); - - /* - * Hooks for intercepting the creation/use/destruction of an - * mm_struct. 
- */ - void (*activate_mm)(struct mm_struct *prev, - struct mm_struct *next); - void (*dup_mmap)(struct mm_struct *oldmm, - struct mm_struct *mm); void (*exit_mmap)(struct mm_struct *mm); - - - /* TLB operations */ - void (*flush_tlb_user)(void); - void (*flush_tlb_kernel)(void); - void (*flush_tlb_single)(unsigned long addr); void (*flush_tlb_others)(const struct cpumask *cpus, struct mm_struct *mm, unsigned long start, unsigned long end); - - /* Hooks for allocating and freeing a pagetable top-level */ - int (*pgd_alloc)(struct mm_struct *mm); - void (*pgd_free)(struct mm_struct *mm, pgd_t *pgd); - - /* - * Hooks for allocating/releasing pagetable pages when they're - * attached to a pagetable - */ - void (*alloc_pte)(struct mm_struct *mm, unsigned long pfn); - void (*alloc_pmd)(struct mm_struct *mm, unsigned long pfn); - void (*alloc_pud)(struct mm_struct *mm, unsigned long pfn); - void (*alloc_p4d)(struct mm_struct *mm, unsigned long pfn); - void (*release_pte)(unsigned long pfn); - void (*release_pmd)(unsigned long pfn); - void (*release_pud)(unsigned long pfn); - void (*release_p4d)(unsigned long pfn); - - /* Pagetable manipulation functions */ - void (*set_pte)(pte_t *ptep, pte_t pteval); - void (*set_pte_at)(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, pte_t pteval); - void (*set_pmd)(pmd_t *pmdp, pmd_t pmdval); - void (*set_pmd_at)(struct mm_struct *mm, unsigned long addr, - pmd_t *pmdp, pmd_t pmdval); - void (*set_pud_at)(struct mm_struct *mm, unsigned long addr, - pud_t *pudp, pud_t pudval); - void (*pte_update)(struct mm_struct *mm, unsigned long addr, - pte_t *ptep); - - pte_t (*ptep_modify_prot_start)(struct mm_struct *mm, unsigned long addr, - pte_t *ptep); - void (*ptep_modify_prot_commit)(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, pte_t pte); - - struct paravirt_callee_save pte_val; - struct paravirt_callee_save make_pte; - - struct paravirt_callee_save pgd_val; - struct paravirt_callee_save make_pgd; - -#if CONFIG_PGTABLE_LEVELS >= 3 -#ifdef CONFIG_X86_PAE - void (*set_pte_atomic)(pte_t *ptep, pte_t pteval); - void (*pte_clear)(struct mm_struct *mm, unsigned long addr, - pte_t *ptep); - void (*pmd_clear)(pmd_t *pmdp); - -#endif /* CONFIG_X86_PAE */ - - void (*set_pud)(pud_t *pudp, pud_t pudval); - - struct paravirt_callee_save pmd_val; - struct paravirt_callee_save make_pmd; - -#if CONFIG_PGTABLE_LEVELS >= 4 - struct paravirt_callee_save pud_val; - struct paravirt_callee_save make_pud; - - void (*set_p4d)(p4d_t *p4dp, p4d_t p4dval); - -#if CONFIG_PGTABLE_LEVELS >= 5 - struct paravirt_callee_save p4d_val; - struct paravirt_callee_save make_p4d; - - void (*set_pgd)(pgd_t *pgdp, pgd_t pgdval); -#endif /* CONFIG_PGTABLE_LEVELS >= 5 */ - -#endif /* CONFIG_PGTABLE_LEVELS >= 4 */ - -#endif /* CONFIG_PGTABLE_LEVELS >= 3 */ - - struct pv_lazy_ops lazy_mode; - - /* dom0 ops */ - - /* Sometimes the physical address is a pfn, and sometimes its - an mfn. We can tell which is which from the index. 
*/ - void (*set_fixmap)(unsigned /* enum fixed_addresses */ idx, - phys_addr_t phys, pgprot_t flags); }; struct arch_spinlock; @@ -260,6 +152,7 @@ struct paravirt_patch_template { #ifdef CONFIG_PARAVIRT_FULL struct pvfull_cpu_ops pvfull_cpu_ops; struct pvfull_irq_ops pvfull_irq_ops; + struct pvfull_mmu_ops pvfull_mmu_ops; #endif }; @@ -599,14 +492,6 @@ enum paravirt_lazy_mode { PARAVIRT_LAZY_CPU, }; -enum paravirt_lazy_mode paravirt_get_lazy_mode(void); -void paravirt_start_context_switch(struct task_struct *prev); -void paravirt_end_context_switch(struct task_struct *next); - -void paravirt_enter_lazy_mmu(void); -void paravirt_leave_lazy_mmu(void); -void paravirt_flush_lazy_mmu(void); - void _paravirt_nop(void); u32 _paravirt_ident_32(u32); u64 _paravirt_ident_64(u64); diff --git a/arch/x86/include/asm/paravirt_types_full.h b/arch/x86/include/asm/paravirt_types_full.h index eabc0ecec8e4..15d595a5f9d2 100644 --- a/arch/x86/include/asm/paravirt_types_full.h +++ b/arch/x86/include/asm/paravirt_types_full.h @@ -1,6 +1,13 @@ #ifndef _ASM_X86_PARAVIRT_TYPES_FULL_H #define _ASM_X86_PARAVIRT_TYPES_FULL_H +struct pv_lazy_ops { + /* Set deferred update mode, used for batching operations. */ + void (*enter)(void); + void (*leave)(void); + void (*flush)(void); +}; + struct pvfull_cpu_ops { /* hooks for various privileged instructions */ unsigned long (*get_debugreg)(int regno); @@ -86,7 +93,116 @@ struct pvfull_irq_ops { #endif }; +struct pvfull_mmu_ops { + unsigned long (*read_cr2)(void); + void (*write_cr2)(unsigned long); + + unsigned long (*read_cr3)(void); + void (*write_cr3)(unsigned long); + + /* + * Hooks for intercepting the creation/use/destruction of an + * mm_struct. + */ + void (*activate_mm)(struct mm_struct *prev, + struct mm_struct *next); + void (*dup_mmap)(struct mm_struct *oldmm, + struct mm_struct *mm); + + /* TLB operations */ + void (*flush_tlb_user)(void); + void (*flush_tlb_kernel)(void); + void (*flush_tlb_single)(unsigned long addr); + + /* Hooks for allocating and freeing a pagetable top-level */ + int (*pgd_alloc)(struct mm_struct *mm); + void (*pgd_free)(struct mm_struct *mm, pgd_t *pgd); + + /* + * Hooks for allocating/releasing pagetable pages when they're + * attached to a pagetable + */ + void (*alloc_pte)(struct mm_struct *mm, unsigned long pfn); + void (*alloc_pmd)(struct mm_struct *mm, unsigned long pfn); + void (*alloc_pud)(struct mm_struct *mm, unsigned long pfn); + void (*alloc_p4d)(struct mm_struct *mm, unsigned long pfn); + void (*release_pte)(unsigned long pfn); + void (*release_pmd)(unsigned long pfn); + void (*release_pud)(unsigned long pfn); + void (*release_p4d)(unsigned long pfn); + + /* Pagetable manipulation functions */ + void (*set_pte)(pte_t *ptep, pte_t pteval); + void (*set_pte_at)(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pteval); + void (*set_pmd)(pmd_t *pmdp, pmd_t pmdval); + void (*set_pmd_at)(struct mm_struct *mm, unsigned long addr, + pmd_t *pmdp, pmd_t pmdval); + void (*set_pud_at)(struct mm_struct *mm, unsigned long addr, + pud_t *pudp, pud_t pudval); + void (*pte_update)(struct mm_struct *mm, unsigned long addr, + pte_t *ptep); + + pte_t (*ptep_modify_prot_start)(struct mm_struct *mm, unsigned long addr, + pte_t *ptep); + void (*ptep_modify_prot_commit)(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); + + struct paravirt_callee_save pte_val; + struct paravirt_callee_save make_pte; + + struct paravirt_callee_save pgd_val; + struct paravirt_callee_save make_pgd; + +#if CONFIG_PGTABLE_LEVELS >= 3 
+#ifdef CONFIG_X86_PAE + void (*set_pte_atomic)(pte_t *ptep, pte_t pteval); + void (*pte_clear)(struct mm_struct *mm, unsigned long addr, + pte_t *ptep); + void (*pmd_clear)(pmd_t *pmdp); + +#endif /* CONFIG_X86_PAE */ + + void (*set_pud)(pud_t *pudp, pud_t pudval); + + struct paravirt_callee_save pmd_val; + struct paravirt_callee_save make_pmd; + +#if CONFIG_PGTABLE_LEVELS >= 4 + struct paravirt_callee_save pud_val; + struct paravirt_callee_save make_pud; + + void (*set_p4d)(p4d_t *p4dp, p4d_t p4dval); + +#if CONFIG_PGTABLE_LEVELS >= 5 + struct paravirt_callee_save p4d_val; + struct paravirt_callee_save make_p4d; + + void (*set_pgd)(pgd_t *pgdp, pgd_t pgdval); +#endif /* CONFIG_PGTABLE_LEVELS >= 5 */ + +#endif /* CONFIG_PGTABLE_LEVELS >= 4 */ + +#endif /* CONFIG_PGTABLE_LEVELS >= 3 */ + + struct pv_lazy_ops lazy_mode; + + /* Sometimes the physical address is a pfn, and sometimes its + an mfn. We can tell which is which from the index. */ + void (*set_fixmap)(unsigned /* enum fixed_addresses */ idx, + phys_addr_t phys, pgprot_t flags); +}; + extern struct pvfull_cpu_ops pvfull_cpu_ops; extern struct pvfull_irq_ops pvfull_irq_ops; +extern struct pvfull_mmu_ops pvfull_mmu_ops; + +enum paravirt_lazy_mode paravirt_get_lazy_mode(void); +void paravirt_start_context_switch(struct task_struct *prev); +void paravirt_end_context_switch(struct task_struct *next); + +void paravirt_enter_lazy_mmu(void); +void paravirt_leave_lazy_mmu(void); +void paravirt_flush_lazy_mmu(void); #endif /* _ASM_X86_PARAVIRT_TYPES_FULL_H */ diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h index 71de65bb1791..5cff39bd7f6d 100644 --- a/arch/x86/include/asm/pgalloc.h +++ b/arch/x86/include/asm/pgalloc.h @@ -7,7 +7,7 @@ static inline int __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; } -#ifdef CONFIG_PARAVIRT +#ifdef CONFIG_PARAVIRT_FULL #include <asm/paravirt.h> #else #define paravirt_pgd_alloc(mm) __paravirt_pgd_alloc(mm) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index fad12c481bf9..60c8f2ac7fee 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -38,9 +38,9 @@ extern struct list_head pgd_list; extern struct mm_struct *pgd_page_get_mm(struct page *page); -#ifdef CONFIG_PARAVIRT +#ifdef CONFIG_PARAVIRT_FULL #include <asm/paravirt.h> -#else /* !CONFIG_PARAVIRT */ +#else /* !CONFIG_PARAVIRT_FULL */ #define set_pte(ptep, pte) native_set_pte(ptep, pte) #define set_pte_at(mm, addr, ptep, pte) native_set_pte_at(mm, addr, ptep, pte) #define set_pmd_at(mm, addr, pmdp, pmd) native_set_pmd_at(mm, addr, pmdp, pmd) @@ -98,10 +98,6 @@ extern struct mm_struct *pgd_page_get_mm(struct page *page); #define pte_val(x) native_pte_val(x) #define __pte(x) native_make_pte(x) -#endif /* CONFIG_PARAVIRT */ - -#ifndef CONFIG_PARAVIRT_FULL - #define arch_end_context_switch(prev) do {} while (0) #endif /* CONFIG_PARAVIRT_FULL */ diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index ca3a3103791d..1ad38e40a770 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -135,7 +135,7 @@ static inline void native_wbinvd(void) extern asmlinkage void native_load_gs_index(unsigned); -#ifdef CONFIG_PARAVIRT +#ifdef CONFIG_PARAVIRT_FULL #include <asm/paravirt.h> #else @@ -159,10 +159,6 @@ static inline void write_cr3(unsigned long x) native_write_cr3(x); } -#endif/* CONFIG_PARAVIRT */ - -#ifndef CONFIG_PARAVIRT_FULL - static inline unsigned long read_cr0(void) { return 
native_read_cr0(); diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 6ed9ea469b48..6b0b6a1f231f 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -56,7 +56,7 @@ static inline void invpcid_flush_all_nonglobals(void) __invpcid(0, 0, INVPCID_TYPE_ALL_NON_GLOBAL); } -#ifdef CONFIG_PARAVIRT +#ifdef CONFIG_PARAVIRT_FULL #include <asm/paravirt.h> #else #define __flush_tlb() __native_flush_tlb() diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c index a32148390e49..18a5c06c007a 100644 --- a/arch/x86/kernel/asm-offsets.c +++ b/arch/x86/kernel/asm-offsets.c @@ -67,15 +67,17 @@ void common(void) { OFFSET(PARAVIRT_PATCH_pv_irq_ops, paravirt_patch_template, pv_irq_ops); OFFSET(PV_IRQ_irq_disable, pv_irq_ops, irq_disable); OFFSET(PV_IRQ_irq_enable, pv_irq_ops, irq_enable); - OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2); #endif #ifdef CONFIG_PARAVIRT_FULL OFFSET(PARAVIRT_PATCH_pvfull_cpu_ops, paravirt_patch_template, pvfull_cpu_ops); OFFSET(PARAVIRT_PATCH_pvfull_irq_ops, paravirt_patch_template, pvfull_irq_ops); + OFFSET(PARAVIRT_PATCH_pvfull_mmu_ops, paravirt_patch_template, + pvfull_mmu_ops); OFFSET(PV_CPU_iret, pvfull_cpu_ops, iret); OFFSET(PV_CPU_read_cr0, pvfull_cpu_ops, read_cr0); + OFFSET(PV_MMU_read_cr2, pvfull_mmu_ops, read_cr2); #endif #ifdef CONFIG_XEN diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index ac9d327d2e42..f004edaf0d1f 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -23,7 +23,7 @@ #include "../entry/calling.h" #include <asm/export.h> -#ifdef CONFIG_PARAVIRT +#ifdef CONFIG_PARAVIRT_FULL #include <asm/asm-offsets.h> #include <asm/paravirt.h> #define GET_CR2_INTO(reg) GET_CR2_INTO_RAX ; movq %rax, reg diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 8e22cfc73349..6fb642572bff 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -28,12 +28,9 @@ #include <asm/bug.h> #include <asm/paravirt.h> #include <asm/setup.h> -#include <asm/pgtable.h> #include <asm/time.h> -#include <asm/pgalloc.h> #include <asm/irq.h> #include <asm/delay.h> -#include <asm/fixmap.h> #include <asm/apic.h> #include <asm/tlbflush.h> #include <asm/timer.h> @@ -179,25 +176,6 @@ unsigned paravirt_patch_insns(void *insnbuf, unsigned len, return insn_len; } -static void native_flush_tlb(void) -{ - __native_flush_tlb(); -} - -/* - * Global pages have to be flushed a bit differently. Not a real - * performance problem because this does not happen often. 
- */ -static void native_flush_tlb_global(void) -{ - __native_flush_tlb_global(); -} - -static void native_flush_tlb_single(unsigned long addr) -{ - __native_flush_tlb_single(addr); -} - struct static_key paravirt_steal_enabled; struct static_key paravirt_steal_rq_enabled; @@ -206,73 +184,6 @@ static u64 native_steal_clock(int cpu) return 0; } -static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = PARAVIRT_LAZY_NONE; - -static inline void enter_lazy(enum paravirt_lazy_mode mode) -{ - BUG_ON(this_cpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE); - - this_cpu_write(paravirt_lazy_mode, mode); -} - -static void leave_lazy(enum paravirt_lazy_mode mode) -{ - BUG_ON(this_cpu_read(paravirt_lazy_mode) != mode); - - this_cpu_write(paravirt_lazy_mode, PARAVIRT_LAZY_NONE); -} - -void paravirt_enter_lazy_mmu(void) -{ - enter_lazy(PARAVIRT_LAZY_MMU); -} - -void paravirt_leave_lazy_mmu(void) -{ - leave_lazy(PARAVIRT_LAZY_MMU); -} - -void paravirt_flush_lazy_mmu(void) -{ - preempt_disable(); - - if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU) { - arch_leave_lazy_mmu_mode(); - arch_enter_lazy_mmu_mode(); - } - - preempt_enable(); -} - -void paravirt_start_context_switch(struct task_struct *prev) -{ - BUG_ON(preemptible()); - - if (this_cpu_read(paravirt_lazy_mode) == PARAVIRT_LAZY_MMU) { - arch_leave_lazy_mmu_mode(); - set_ti_thread_flag(task_thread_info(prev), TIF_LAZY_MMU_UPDATES); - } - enter_lazy(PARAVIRT_LAZY_CPU); -} - -void paravirt_end_context_switch(struct task_struct *next) -{ - BUG_ON(preemptible()); - - leave_lazy(PARAVIRT_LAZY_CPU); - - if (test_and_clear_ti_thread_flag(task_thread_info(next), TIF_LAZY_MMU_UPDATES)) - arch_enter_lazy_mmu_mode(); -} - -enum paravirt_lazy_mode paravirt_get_lazy_mode(void) -{ - if (in_interrupt()) - return PARAVIRT_LAZY_NONE; - - return this_cpu_read(paravirt_lazy_mode); -} - struct pv_info pv_info = { .name = "bare hardware", .kernel_rpl = 0, @@ -303,91 +214,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = { .io_delay = native_io_delay, }; -#if defined(CONFIG_X86_32) && !defined(CONFIG_X86_PAE) -/* 32-bit pagetable entries */ -#define PTE_IDENT __PV_IS_CALLEE_SAVE(_paravirt_ident_32) -#else -/* 64-bit pagetable entries */ -#define PTE_IDENT __PV_IS_CALLEE_SAVE(_paravirt_ident_64) -#endif - struct pv_mmu_ops pv_mmu_ops __ro_after_init = { - - .read_cr2 = native_read_cr2, - .write_cr2 = native_write_cr2, - .read_cr3 = native_read_cr3, - .write_cr3 = native_write_cr3, - - .flush_tlb_user = native_flush_tlb, - .flush_tlb_kernel = native_flush_tlb_global, - .flush_tlb_single = native_flush_tlb_single, .flush_tlb_others = native_flush_tlb_others, - - .pgd_alloc = __paravirt_pgd_alloc, - .pgd_free = paravirt_nop, - - .alloc_pte = paravirt_nop, - .alloc_pmd = paravirt_nop, - .alloc_pud = paravirt_nop, - .alloc_p4d = paravirt_nop, - .release_pte = paravirt_nop, - .release_pmd = paravirt_nop, - .release_pud = paravirt_nop, - .release_p4d = paravirt_nop, - - .set_pte = native_set_pte, - .set_pte_at = native_set_pte_at, - .set_pmd = native_set_pmd, - .set_pmd_at = native_set_pmd_at, - .pte_update = paravirt_nop, - - .ptep_modify_prot_start = __ptep_modify_prot_start, - .ptep_modify_prot_commit = __ptep_modify_prot_commit, - -#if CONFIG_PGTABLE_LEVELS >= 3 -#ifdef CONFIG_X86_PAE - .set_pte_atomic = native_set_pte_atomic, - .pte_clear = native_pte_clear, - .pmd_clear = native_pmd_clear, -#endif - .set_pud = native_set_pud, - .set_pud_at = native_set_pud_at, - - .pmd_val = PTE_IDENT, - .make_pmd = PTE_IDENT, - -#if CONFIG_PGTABLE_LEVELS >= 4 - .pud_val = 
PTE_IDENT, - .make_pud = PTE_IDENT, - - .set_p4d = native_set_p4d, - -#if CONFIG_PGTABLE_LEVELS >= 5 - .p4d_val = PTE_IDENT, - .make_p4d = PTE_IDENT, - - .set_pgd = native_set_pgd, -#endif /* CONFIG_PGTABLE_LEVELS >= 5 */ -#endif /* CONFIG_PGTABLE_LEVELS >= 4 */ -#endif /* CONFIG_PGTABLE_LEVELS >= 3 */ - - .pte_val = PTE_IDENT, - .pgd_val = PTE_IDENT, - - .make_pte = PTE_IDENT, - .make_pgd = PTE_IDENT, - - .dup_mmap = paravirt_nop, .exit_mmap = paravirt_nop, - .activate_mm = paravirt_nop, - - .lazy_mode = { - .enter = paravirt_nop, - .leave = paravirt_nop, - .flush = paravirt_nop, - }, - - .set_fixmap = native_set_fixmap, }; EXPORT_SYMBOL_GPL(pv_time_ops); diff --git a/arch/x86/kernel/paravirt_full.c b/arch/x86/kernel/paravirt_full.c index 353968da3ddc..b90dfa7428bd 100644 --- a/arch/x86/kernel/paravirt_full.c +++ b/arch/x86/kernel/paravirt_full.c @@ -19,12 +19,103 @@ #include <asm/paravirt.h> #include <asm/debugreg.h> #include <asm/desc.h> +#include <asm/pgalloc.h> #include <asm/processor.h> +#include <asm/tlbflush.h> /* These are in entry.S */ extern void native_iret(void); extern void native_usergs_sysret64(void); +static DEFINE_PER_CPU(enum paravirt_lazy_mode, paravirt_lazy_mode) = + PARAVIRT_LAZY_NONE; + +static inline void enter_lazy(enum paravirt_lazy_mode mode) +{ + BUG_ON(this_cpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE); + + this_cpu_write(paravirt_lazy_mode, mode); +} + +static void leave_lazy(enum paravirt_lazy_mode mode) +{ + BUG_ON(this_cpu_read(paravirt_lazy_mode) != mode); + + this_cpu_write(paravirt_lazy_mode, PARAVIRT_LAZY_NONE); +} + +void paravirt_enter_lazy_mmu(void) +{ + enter_lazy(PARAVIRT_LAZY_MMU); +} + +void paravirt_leave_lazy_mmu(void) +{ + leave_lazy(PARAVIRT_LAZY_MMU); +} + +void paravirt_flush_lazy_mmu(void) +{ + preempt_disable(); + + if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU) { + arch_leave_lazy_mmu_mode(); + arch_enter_lazy_mmu_mode(); + } + + preempt_enable(); +} + +void paravirt_start_context_switch(struct task_struct *prev) +{ + BUG_ON(preemptible()); + + if (this_cpu_read(paravirt_lazy_mode) == PARAVIRT_LAZY_MMU) { + arch_leave_lazy_mmu_mode(); + set_ti_thread_flag(task_thread_info(prev), + TIF_LAZY_MMU_UPDATES); + } + enter_lazy(PARAVIRT_LAZY_CPU); +} + +void paravirt_end_context_switch(struct task_struct *next) +{ + BUG_ON(preemptible()); + + leave_lazy(PARAVIRT_LAZY_CPU); + + if (test_and_clear_ti_thread_flag(task_thread_info(next), + TIF_LAZY_MMU_UPDATES)) + arch_enter_lazy_mmu_mode(); +} + +enum paravirt_lazy_mode paravirt_get_lazy_mode(void) +{ + if (in_interrupt()) + return PARAVIRT_LAZY_NONE; + + return this_cpu_read(paravirt_lazy_mode); +} + +static void native_flush_tlb(void) +{ + __native_flush_tlb(); +} + +/* + * Global pages have to be flushed a bit differently. Not a real + * performance problem because this does not happen often. 
+ */ +static void native_flush_tlb_global(void) +{ + __native_flush_tlb_global(); +} + +static void native_flush_tlb_single(unsigned long addr) +{ + __native_flush_tlb_single(addr); +} + __visible struct pvfull_cpu_ops pvfull_cpu_ops = { .cpuid = native_cpuid, .get_debugreg = native_get_debugreg, @@ -82,6 +173,90 @@ __visible struct pvfull_irq_ops pvfull_irq_ops = { #endif }; +#if defined(CONFIG_X86_32) && !defined(CONFIG_X86_PAE) +/* 32-bit pagetable entries */ +#define PTE_IDENT __PV_IS_CALLEE_SAVE(_paravirt_ident_32) +#else +/* 64-bit pagetable entries */ +#define PTE_IDENT __PV_IS_CALLEE_SAVE(_paravirt_ident_64) +#endif + +struct pvfull_mmu_ops pvfull_mmu_ops = { + .read_cr2 = native_read_cr2, + .write_cr2 = native_write_cr2, + .read_cr3 = native_read_cr3, + .write_cr3 = native_write_cr3, + + .flush_tlb_user = native_flush_tlb, + .flush_tlb_kernel = native_flush_tlb_global, + .flush_tlb_single = native_flush_tlb_single, + + .pgd_alloc = __paravirt_pgd_alloc, + .pgd_free = paravirt_nop, + + .alloc_pte = paravirt_nop, + .alloc_pmd = paravirt_nop, + .alloc_pud = paravirt_nop, + .alloc_p4d = paravirt_nop, + .release_pte = paravirt_nop, + .release_pmd = paravirt_nop, + .release_pud = paravirt_nop, + .release_p4d = paravirt_nop, + + .set_pte = native_set_pte, + .set_pte_at = native_set_pte_at, + .set_pmd = native_set_pmd, + .set_pmd_at = native_set_pmd_at, + .pte_update = paravirt_nop, + + .ptep_modify_prot_start = __ptep_modify_prot_start, + .ptep_modify_prot_commit = __ptep_modify_prot_commit, + +#if CONFIG_PGTABLE_LEVELS >= 3 +#ifdef CONFIG_X86_PAE + .set_pte_atomic = native_set_pte_atomic, + .pte_clear = native_pte_clear, + .pmd_clear = native_pmd_clear, +#endif + .set_pud = native_set_pud, + .set_pud_at = native_set_pud_at, + + .pmd_val = PTE_IDENT, + .make_pmd = PTE_IDENT, + +#if CONFIG_PGTABLE_LEVELS >= 4 + .pud_val = PTE_IDENT, + .make_pud = PTE_IDENT, + + .set_p4d = native_set_p4d, + +#if CONFIG_PGTABLE_LEVELS >= 5 + .p4d_val = PTE_IDENT, + .make_p4d = PTE_IDENT, + + .set_pgd = native_set_pgd, +#endif /* CONFIG_PGTABLE_LEVELS >= 5 */ +#endif /* CONFIG_PGTABLE_LEVELS >= 4 */ +#endif /* CONFIG_PGTABLE_LEVELS >= 3 */ + + .pte_val = PTE_IDENT, + .pgd_val = PTE_IDENT, + + .make_pte = PTE_IDENT, + .make_pgd = PTE_IDENT, + + .dup_mmap = paravirt_nop, + .activate_mm = paravirt_nop, + + .lazy_mode = { + .enter = paravirt_nop, + .leave = paravirt_nop, + .flush = paravirt_nop, + }, + + .set_fixmap = native_set_fixmap, +}; + /* At this point, native_get/set_debugreg has real function entries */ NOKPROBE_SYMBOL(native_get_debugreg); NOKPROBE_SYMBOL(native_set_debugreg); @@ -89,3 +264,4 @@ NOKPROBE_SYMBOL(native_load_idt); EXPORT_SYMBOL(pvfull_cpu_ops); EXPORT_SYMBOL_GPL(pvfull_irq_ops); +EXPORT_SYMBOL(pvfull_mmu_ops); diff --git a/arch/x86/kernel/paravirt_patch_32.c b/arch/x86/kernel/paravirt_patch_32.c index ccb75951aed5..b5f93cb0d05f 100644 --- a/arch/x86/kernel/paravirt_patch_32.c +++ b/arch/x86/kernel/paravirt_patch_32.c @@ -4,10 +4,10 @@ DEF_NATIVE(pv_irq_ops, irq_disable, "cli"); DEF_NATIVE(pv_irq_ops, irq_enable, "sti"); DEF_NATIVE(pv_irq_ops, restore_fl, "push %eax; popf"); DEF_NATIVE(pv_irq_ops, save_fl, "pushf; pop %eax"); -DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax"); -DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3"); -DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax"); #ifdef CONFIG_PARAVIRT_FULL +DEF_NATIVE(pvfull_mmu_ops, read_cr2, "mov %cr2, %eax"); +DEF_NATIVE(pvfull_mmu_ops, write_cr3, "mov %eax, %cr3"); +DEF_NATIVE(pvfull_mmu_ops, read_cr3, "mov %cr3, %eax"); 
DEF_NATIVE(pvfull_cpu_ops, iret, "iret"); #endif @@ -47,10 +47,10 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf, PATCH_SITE(pv_irq_ops, irq_enable); PATCH_SITE(pv_irq_ops, restore_fl); PATCH_SITE(pv_irq_ops, save_fl); - PATCH_SITE(pv_mmu_ops, read_cr2); - PATCH_SITE(pv_mmu_ops, read_cr3); - PATCH_SITE(pv_mmu_ops, write_cr3); #ifdef CONFIG_PARAVIRT_FULL + PATCH_SITE(pvfull_mmu_ops, read_cr2); + PATCH_SITE(pvfull_mmu_ops, read_cr3); + PATCH_SITE(pvfull_mmu_ops, write_cr3); PATCH_SITE(pvfull_cpu_ops, iret); #endif #if defined(CONFIG_PARAVIRT_SPINLOCKS) diff --git a/arch/x86/kernel/paravirt_patch_64.c b/arch/x86/kernel/paravirt_patch_64.c index 00d5c77d23a7..473688054f0b 100644 --- a/arch/x86/kernel/paravirt_patch_64.c +++ b/arch/x86/kernel/paravirt_patch_64.c @@ -6,15 +6,15 @@ DEF_NATIVE(pv_irq_ops, irq_disable, "cli"); DEF_NATIVE(pv_irq_ops, irq_enable, "sti"); DEF_NATIVE(pv_irq_ops, restore_fl, "pushq %rdi; popfq"); DEF_NATIVE(pv_irq_ops, save_fl, "pushfq; popq %rax"); -DEF_NATIVE(pv_mmu_ops, read_cr2, "movq %cr2, %rax"); -DEF_NATIVE(pv_mmu_ops, read_cr3, "movq %cr3, %rax"); -DEF_NATIVE(pv_mmu_ops, write_cr3, "movq %rdi, %cr3"); -DEF_NATIVE(pv_mmu_ops, flush_tlb_single, "invlpg (%rdi)"); DEF_NATIVE(, mov32, "mov %edi, %eax"); DEF_NATIVE(, mov64, "mov %rdi, %rax"); #ifdef CONFIG_PARAVIRT_FULL +DEF_NATIVE(pvfull_mmu_ops, read_cr2, "movq %cr2, %rax"); +DEF_NATIVE(pvfull_mmu_ops, read_cr3, "movq %cr3, %rax"); +DEF_NATIVE(pvfull_mmu_ops, write_cr3, "movq %rdi, %cr3"); +DEF_NATIVE(pvfull_mmu_ops, flush_tlb_single, "invlpg (%rdi)"); DEF_NATIVE(pvfull_cpu_ops, wbinvd, "wbinvd"); DEF_NATIVE(pvfull_cpu_ops, usergs_sysret64, "swapgs; sysretq"); DEF_NATIVE(pvfull_cpu_ops, swapgs, "swapgs"); @@ -56,11 +56,11 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf, PATCH_SITE(pv_irq_ops, save_fl); PATCH_SITE(pv_irq_ops, irq_enable); PATCH_SITE(pv_irq_ops, irq_disable); - PATCH_SITE(pv_mmu_ops, read_cr2); - PATCH_SITE(pv_mmu_ops, read_cr3); - PATCH_SITE(pv_mmu_ops, write_cr3); - PATCH_SITE(pv_mmu_ops, flush_tlb_single); #ifdef CONFIG_PARAVIRT_FULL + PATCH_SITE(pvfull_mmu_ops, read_cr2); + PATCH_SITE(pvfull_mmu_ops, read_cr3); + PATCH_SITE(pvfull_mmu_ops, write_cr3); + PATCH_SITE(pvfull_mmu_ops, flush_tlb_single); PATCH_SITE(pvfull_cpu_ops, usergs_sysret64); PATCH_SITE(pvfull_cpu_ops, swapgs); PATCH_SITE(pvfull_cpu_ops, wbinvd); diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c index bf8773854ab0..b9757853cf79 100644 --- a/arch/x86/lguest/boot.c +++ b/arch/x86/lguest/boot.c @@ -753,7 +753,7 @@ static void lguest_pmd_clear(pmd_t *pmdp) #endif /* - * Unfortunately for Lguest, the pv_mmu_ops for page tables were based on + * Unfortunately for Lguest, the pvfull_mmu_ops for page tables were based on * native page table operations. On native hardware you can set a new page * table entry whenever you want, but if you want to remove one you have to do * a TLB flush (a TLB is a little cache of page table entries kept by the CPU). 
@@ -1431,25 +1431,25 @@ __init void lguest_init(void) pvfull_cpu_ops.end_context_switch = lguest_end_context_switch; /* Pagetable management */ - pv_mmu_ops.write_cr3 = lguest_write_cr3; - pv_mmu_ops.flush_tlb_user = lguest_flush_tlb_user; - pv_mmu_ops.flush_tlb_single = lguest_flush_tlb_single; - pv_mmu_ops.flush_tlb_kernel = lguest_flush_tlb_kernel; - pv_mmu_ops.set_pte = lguest_set_pte; - pv_mmu_ops.set_pte_at = lguest_set_pte_at; - pv_mmu_ops.set_pmd = lguest_set_pmd; + pvfull_mmu_ops.write_cr3 = lguest_write_cr3; + pvfull_mmu_ops.flush_tlb_user = lguest_flush_tlb_user; + pvfull_mmu_ops.flush_tlb_single = lguest_flush_tlb_single; + pvfull_mmu_ops.flush_tlb_kernel = lguest_flush_tlb_kernel; + pvfull_mmu_ops.set_pte = lguest_set_pte; + pvfull_mmu_ops.set_pte_at = lguest_set_pte_at; + pvfull_mmu_ops.set_pmd = lguest_set_pmd; #ifdef CONFIG_X86_PAE - pv_mmu_ops.set_pte_atomic = lguest_set_pte_atomic; - pv_mmu_ops.pte_clear = lguest_pte_clear; - pv_mmu_ops.pmd_clear = lguest_pmd_clear; - pv_mmu_ops.set_pud = lguest_set_pud; + pvfull_mmu_ops.set_pte_atomic = lguest_set_pte_atomic; + pvfull_mmu_ops.pte_clear = lguest_pte_clear; + pvfull_mmu_ops.pmd_clear = lguest_pmd_clear; + pvfull_mmu_ops.set_pud = lguest_set_pud; #endif - pv_mmu_ops.read_cr2 = lguest_read_cr2; - pv_mmu_ops.read_cr3 = lguest_read_cr3; - pv_mmu_ops.lazy_mode.enter = paravirt_enter_lazy_mmu; - pv_mmu_ops.lazy_mode.leave = lguest_leave_lazy_mmu_mode; - pv_mmu_ops.lazy_mode.flush = paravirt_flush_lazy_mmu; - pv_mmu_ops.pte_update = lguest_pte_update; + pvfull_mmu_ops.read_cr2 = lguest_read_cr2; + pvfull_mmu_ops.read_cr3 = lguest_read_cr3; + pvfull_mmu_ops.lazy_mode.enter = paravirt_enter_lazy_mmu; + pvfull_mmu_ops.lazy_mode.leave = lguest_leave_lazy_mmu_mode; + pvfull_mmu_ops.lazy_mode.flush = paravirt_flush_lazy_mmu; + pvfull_mmu_ops.pte_update = lguest_pte_update; #ifdef CONFIG_X86_LOCAL_APIC /* APIC read/write intercepts */ diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 89cd5cc5f1a2..9badad9f82e0 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -1002,7 +1002,7 @@ void xen_setup_vcpu_info_placement(void) pv_irq_ops.restore_fl = __PV_IS_CALLEE_SAVE(xen_restore_fl_direct); pv_irq_ops.irq_disable = __PV_IS_CALLEE_SAVE(xen_irq_disable_direct); pv_irq_ops.irq_enable = __PV_IS_CALLEE_SAVE(xen_irq_enable_direct); - pv_mmu_ops.read_cr2 = xen_read_cr2_direct; + pvfull_mmu_ops.read_cr2 = xen_read_cr2_direct; } } @@ -1316,8 +1316,10 @@ asmlinkage __visible void __init xen_start_kernel(void) #endif if (xen_feature(XENFEAT_mmu_pt_update_preserve_ad)) { - pv_mmu_ops.ptep_modify_prot_start = xen_ptep_modify_prot_start; - pv_mmu_ops.ptep_modify_prot_commit = xen_ptep_modify_prot_commit; + pvfull_mmu_ops.ptep_modify_prot_start = + xen_ptep_modify_prot_start; + pvfull_mmu_ops.ptep_modify_prot_commit = + xen_ptep_modify_prot_commit; } machine_ops = xen_machine_ops; diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index 7397d8b8459d..7be3e21a4dac 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -2252,7 +2252,7 @@ static void __init xen_write_cr3_init(unsigned long cr3) set_page_prot(initial_page_table, PAGE_KERNEL); set_page_prot(initial_kernel_pmd, PAGE_KERNEL); - pv_mmu_ops.write_cr3 = &xen_write_cr3; + pvfull_mmu_ops.write_cr3 = &xen_write_cr3; } /* @@ -2406,27 +2406,27 @@ static void __init xen_post_allocator_init(void) if (xen_feature(XENFEAT_auto_translated_physmap)) return; - pv_mmu_ops.set_pte = xen_set_pte; - pv_mmu_ops.set_pmd = xen_set_pmd; -
pv_mmu_ops.set_pud = xen_set_pud; + pvfull_mmu_ops.set_pte = xen_set_pte; + pvfull_mmu_ops.set_pmd = xen_set_pmd; + pvfull_mmu_ops.set_pud = xen_set_pud; #if CONFIG_PGTABLE_LEVELS >= 4 - pv_mmu_ops.set_p4d = xen_set_p4d; + pvfull_mmu_ops.set_p4d = xen_set_p4d; #endif /* This will work as long as patching hasn't happened yet (which it hasn't) */ - pv_mmu_ops.alloc_pte = xen_alloc_pte; - pv_mmu_ops.alloc_pmd = xen_alloc_pmd; - pv_mmu_ops.release_pte = xen_release_pte; - pv_mmu_ops.release_pmd = xen_release_pmd; + pvfull_mmu_ops.alloc_pte = xen_alloc_pte; + pvfull_mmu_ops.alloc_pmd = xen_alloc_pmd; + pvfull_mmu_ops.release_pte = xen_release_pte; + pvfull_mmu_ops.release_pmd = xen_release_pmd; #if CONFIG_PGTABLE_LEVELS >= 4 - pv_mmu_ops.alloc_pud = xen_alloc_pud; - pv_mmu_ops.release_pud = xen_release_pud; + pvfull_mmu_ops.alloc_pud = xen_alloc_pud; + pvfull_mmu_ops.release_pud = xen_release_pud; #endif - pv_mmu_ops.make_pte = PV_CALLEE_SAVE(xen_make_pte); + pvfull_mmu_ops.make_pte = PV_CALLEE_SAVE(xen_make_pte); #ifdef CONFIG_X86_64 - pv_mmu_ops.write_cr3 = &xen_write_cr3; + pvfull_mmu_ops.write_cr3 = &xen_write_cr3; SetPagePinned(virt_to_page(level3_user_vsyscall)); #endif xen_mark_init_mm_pinned(); @@ -2440,7 +2440,7 @@ static void xen_leave_lazy_mmu(void) preempt_enable(); } -static const struct pv_mmu_ops xen_mmu_ops __initconst = { +static const struct pvfull_mmu_ops xen_mmu_ops __initconst = { .read_cr2 = xen_read_cr2, .write_cr2 = xen_write_cr2, @@ -2450,7 +2450,6 @@ static const struct pv_mmu_ops xen_mmu_ops __initconst = { .flush_tlb_user = xen_flush_tlb, .flush_tlb_kernel = xen_flush_tlb, .flush_tlb_single = xen_flush_tlb_single, - .flush_tlb_others = xen_flush_tlb_others, .pte_update = paravirt_nop, @@ -2496,7 +2495,6 @@ static const struct pv_mmu_ops xen_mmu_ops __initconst = { .activate_mm = xen_activate_mm, .dup_mmap = xen_dup_mmap, - .exit_mmap = xen_exit_mmap, .lazy_mode = { .enter = paravirt_enter_lazy_mmu, @@ -2514,7 +2512,9 @@ void __init xen_init_mmu_ops(void) if (xen_feature(XENFEAT_auto_translated_physmap)) return; - pv_mmu_ops = xen_mmu_ops; + pvfull_mmu_ops = xen_mmu_ops; + pv_mmu_ops.flush_tlb_others = xen_flush_tlb_others; + pv_mmu_ops.exit_mmap = xen_exit_mmap; memset(dummy_mapping, 0xff, PAGE_SIZE); } -- 2.12.0
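To make the lazy-mode plumbing above concrete, here is a minimal sketch of how a pv backend wires the pvfull_mmu_ops.lazy_mode hooks that this patch moves into paravirt_full.c. The example_* names are hypothetical and not part of the series; the enter/leave helpers only track per-cpu state, and the backend's leave hook is where a batched queue of page-table updates would be pushed to the hypervisor.

#include <linux/percpu.h>
#include <asm/paravirt.h>

static DEFINE_PER_CPU(unsigned int, example_queued);	/* pending batched updates */

static void example_lazy_leave(void)
{
	/* Hypothetical: push any queued page-table updates in one hypercall. */
	if (this_cpu_read(example_queued))
		this_cpu_write(example_queued, 0);
	paravirt_leave_lazy_mmu();	/* per-cpu state back to PARAVIRT_LAZY_NONE */
}

/* Wired up the same way xen_mmu_ops and lguest do it above:
 *	pvfull_mmu_ops.lazy_mode.enter = paravirt_enter_lazy_mmu;
 *	pvfull_mmu_ops.lazy_mode.leave = example_lazy_leave;
 *	pvfull_mmu_ops.lazy_mode.flush = paravirt_flush_lazy_mmu;
 */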
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 09/10] paravirt: split pv_info for support of PARAVIRT_FULL
Move members needed for fully paravirtualized guests only into a new structure pvfull_info in paravirt_types_full.h, paravirt_full.h and the associated vector into paravirt_full.c. Signed-off-by: Juergen Gross <jgross at suse.com> --- arch/x86/boot/compressed/misc.h | 1 + arch/x86/include/asm/paravirt.h | 2 -- arch/x86/include/asm/paravirt_full.h | 2 ++ arch/x86/include/asm/paravirt_types.h | 7 ------- arch/x86/include/asm/paravirt_types_full.h | 10 ++++++++++ arch/x86/include/asm/pgtable-3level_types.h | 4 ++-- arch/x86/include/asm/ptrace.h | 5 +++-- arch/x86/include/asm/segment.h | 2 +- arch/x86/kernel/paravirt.c | 9 ++------- arch/x86/kernel/paravirt_full.c | 10 ++++++++++ arch/x86/lguest/boot.c | 4 ++-- arch/x86/xen/enlighten_pv.c | 12 ++++++------ 12 files changed, 39 insertions(+), 29 deletions(-) diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h index 1c8355eadbd1..007b58f3d985 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -9,6 +9,7 @@ */ #undef CONFIG_PARAVIRT #undef CONFIG_PARAVIRT_SPINLOCKS +#undef CONFIG_PARAVIRT_FULL #undef CONFIG_KASAN #include <linux/linkage.h> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 3b9960a5de4a..55e0c1807df2 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -26,8 +26,6 @@ static inline enum paravirt_lazy_mode paravirt_get_lazy_mode(void) #endif -#define get_kernel_rpl() (pv_info.kernel_rpl) - static inline unsigned long long paravirt_sched_clock(void) { return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock); diff --git a/arch/x86/include/asm/paravirt_full.h b/arch/x86/include/asm/paravirt_full.h index 53f2eb436ba3..95d1c21bbef7 100644 --- a/arch/x86/include/asm/paravirt_full.h +++ b/arch/x86/include/asm/paravirt_full.h @@ -3,6 +3,8 @@ #ifndef __ASSEMBLY__ +#define get_kernel_rpl() (pvfull_info.kernel_rpl) + static inline void load_sp0(struct tss_struct *tss, struct thread_struct *thread) { diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index b1ac2a5698b4..34753d10ebbc 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -66,13 +66,6 @@ struct paravirt_callee_save { /* general info */ struct pv_info { - unsigned int kernel_rpl; - int shared_kernel_pmd; - -#ifdef CONFIG_X86_64 - u16 extra_user_64bit_cs; /* __USER_CS if none */ -#endif - const char *name; }; diff --git a/arch/x86/include/asm/paravirt_types_full.h b/arch/x86/include/asm/paravirt_types_full.h index 15d595a5f9d2..b1f91fad5842 100644 --- a/arch/x86/include/asm/paravirt_types_full.h +++ b/arch/x86/include/asm/paravirt_types_full.h @@ -1,6 +1,15 @@ #ifndef _ASM_X86_PARAVIRT_TYPES_FULL_H #define _ASM_X86_PARAVIRT_TYPES_FULL_H +struct pvfull_info { + unsigned int kernel_rpl; + int shared_kernel_pmd; + +#ifdef CONFIG_X86_64 + u16 extra_user_64bit_cs; /* __USER_CS if none */ +#endif +}; + struct pv_lazy_ops { /* Set deferred update mode, used for batching operations. 
*/ void (*enter)(void); @@ -193,6 +202,7 @@ struct pvfull_mmu_ops { phys_addr_t phys, pgprot_t flags); }; +extern struct pvfull_info pvfull_info; extern struct pvfull_cpu_ops pvfull_cpu_ops; extern struct pvfull_irq_ops pvfull_irq_ops; extern struct pvfull_mmu_ops pvfull_mmu_ops; diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h index b8a4341faafa..fdf132570189 100644 --- a/arch/x86/include/asm/pgtable-3level_types.h +++ b/arch/x86/include/asm/pgtable-3level_types.h @@ -19,8 +19,8 @@ typedef union { } pte_t; #endif /* !__ASSEMBLY__ */ -#ifdef CONFIG_PARAVIRT -#define SHARED_KERNEL_PMD (pv_info.shared_kernel_pmd) +#ifdef CONFIG_PARAVIRT_FULL +#define SHARED_KERNEL_PMD (pvfull_info.shared_kernel_pmd) #else #define SHARED_KERNEL_PMD 1 #endif diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index 2b5d686ea9f3..81d73663c497 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -118,7 +118,7 @@ static inline int v8086_mode(struct pt_regs *regs) #ifdef CONFIG_X86_64 static inline bool user_64bit_mode(struct pt_regs *regs) { -#ifndef CONFIG_PARAVIRT +#ifndef CONFIG_PARAVIRT_FULL /* * On non-paravirt systems, this is the only long mode CPL 3 * selector. We do not allow long mode selectors in the LDT. @@ -126,7 +126,8 @@ static inline bool user_64bit_mode(struct pt_regs *regs) return regs->cs == __USER_CS; #else /* Headers are too twisted for this to go in paravirt.h. */ - return regs->cs == __USER_CS || regs->cs == pv_info.extra_user_64bit_cs; + return regs->cs == __USER_CS || + regs->cs == pvfull_info.extra_user_64bit_cs; #endif } diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h index 1549caa098f0..1c8f320934d1 100644 --- a/arch/x86/include/asm/segment.h +++ b/arch/x86/include/asm/segment.h @@ -210,7 +210,7 @@ #endif -#ifndef CONFIG_PARAVIRT +#ifndef CONFIG_PARAVIRT_FULL # define get_kernel_rpl() 0 #endif diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 6fb642572bff..42da2fde1fef 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -125,6 +125,7 @@ static void *get_call_destination(u8 type) #ifdef CONFIG_PARAVIRT_FULL .pvfull_cpu_ops = pvfull_cpu_ops, .pvfull_irq_ops = pvfull_irq_ops, + .pvfull_mmu_ops = pvfull_mmu_ops, #endif }; return *((void **)&tmpl + type); @@ -186,12 +187,6 @@ static u64 native_steal_clock(int cpu) struct pv_info pv_info = { .name = "bare hardware", - .kernel_rpl = 0, - .shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */ - -#ifdef CONFIG_X86_64 - .extra_user_64bit_cs = __USER_CS, -#endif }; struct pv_init_ops pv_init_ops = { @@ -222,5 +217,5 @@ struct pv_mmu_ops pv_mmu_ops __ro_after_init = { EXPORT_SYMBOL_GPL(pv_time_ops); EXPORT_SYMBOL (pv_cpu_ops); EXPORT_SYMBOL (pv_mmu_ops); -EXPORT_SYMBOL_GPL(pv_info); +EXPORT_SYMBOL (pv_info); EXPORT_SYMBOL (pv_irq_ops); diff --git a/arch/x86/kernel/paravirt_full.c b/arch/x86/kernel/paravirt_full.c index b90dfa7428bd..b65d19d8d9d7 100644 --- a/arch/x86/kernel/paravirt_full.c +++ b/arch/x86/kernel/paravirt_full.c @@ -116,6 +116,15 @@ static void native_flush_tlb_single(unsigned long addr) __native_flush_tlb_single(addr); } +struct pvfull_info pvfull_info = { + .kernel_rpl = 0, + .shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */ + +#ifdef CONFIG_X86_64 + .extra_user_64bit_cs = __USER_CS, +#endif +}; + __visible struct pvfull_cpu_ops pvfull_cpu_ops = { .cpuid = native_cpuid, .get_debugreg = native_get_debugreg, @@ -262,6 
+271,7 @@ NOKPROBE_SYMBOL(native_get_debugreg); NOKPROBE_SYMBOL(native_set_debugreg); NOKPROBE_SYMBOL(native_load_idt); +EXPORT_SYMBOL_GPL(pvfull_info); EXPORT_SYMBOL(pvfull_cpu_ops); EXPORT_SYMBOL_GPL(pvfull_irq_ops); EXPORT_SYMBOL(pvfull_mmu_ops); diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c index b9757853cf79..86b8b1a0c99e 100644 --- a/arch/x86/lguest/boot.c +++ b/arch/x86/lguest/boot.c @@ -1390,9 +1390,9 @@ __init void lguest_init(void) /* We're under lguest. */ pv_info.name = "lguest"; /* We're running at privilege level 1, not 0 as normal. */ - pv_info.kernel_rpl = 1; + pvfull_info.kernel_rpl = 1; /* Everyone except Xen runs with this set. */ - pv_info.shared_kernel_pmd = 1; + pvfull_info.shared_kernel_pmd = 1; /* * We set up all the lguest overrides for sensitive operations. These diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 9badad9f82e0..dcf1b4183c49 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -1059,13 +1059,12 @@ static unsigned xen_patch(u8 type, u16 clobbers, void *insnbuf, return ret; } -static const struct pv_info xen_info __initconst = { +static const struct pvfull_info xen_info __initconst = { .shared_kernel_pmd = 0, #ifdef CONFIG_X86_64 .extra_user_64bit_cs = FLAT_USER_CS64, #endif - .name = "Xen", }; static const struct pv_init_ops xen_init_ops __initconst = { @@ -1267,7 +1266,8 @@ asmlinkage __visible void __init xen_start_kernel(void) xen_setup_machphys_mapping(); /* Install Xen paravirt ops */ - pv_info = xen_info; + pvfull_info = xen_info; + pv_info.name = "Xen"; pv_init_ops = xen_init_ops; pvfull_cpu_ops = xen_cpu_ops; pv_cpu_ops.io_delay = xen_io_delay; @@ -1358,11 +1358,11 @@ asmlinkage __visible void __init xen_start_kernel(void) /* keep using Xen gdt for now; no urgent need to change it */ #ifdef CONFIG_X86_32 - pv_info.kernel_rpl = 1; + pvfull_info.kernel_rpl = 1; if (xen_feature(XENFEAT_supervisor_mode_kernel)) - pv_info.kernel_rpl = 0; + pvfull_info.kernel_rpl = 0; #else - pv_info.kernel_rpl = 0; + pvfull_info.kernel_rpl = 0; #endif /* set the limit of our address space */ xen_reserve_top(); -- 2.12.0
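The effect for kernels built without the new option can be condensed into a short sketch (summarizing the segment.h, paravirt_full.h and pgtable-3level_types.h hunks above, not new code): the pv_info fields consulted on hot paths collapse to compile-time constants, and only the name string stays in the common pv_info.

#ifdef CONFIG_PARAVIRT_FULL
# define get_kernel_rpl()	(pvfull_info.kernel_rpl)	/* e.g. 1 for 32-bit Xen PV/lguest */
# define SHARED_KERNEL_PMD	(pvfull_info.shared_kernel_pmd)	/* 0 for Xen PV */
#else
# define get_kernel_rpl()	0	/* bare metal, KVM, Xen HVM, ... */
# define SHARED_KERNEL_PMD	1
#endif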
Juergen Gross
2017-May-19 15:47 UTC
[PATCH 10/10] paravirt: merge pv_ops_* structures into one
As there are now only very few pvops functions left when CONFIG_PARAVIRT_FULL isn't set, merge the related structures into one named "pv_ops". Signed-off-by: Juergen Gross <jgross at suse.com> --- arch/x86/include/asm/paravirt.h | 32 ++++++++++++++++---------------- arch/x86/include/asm/paravirt_types.h | 27 ++++----------------------- arch/x86/kernel/alternative.c | 4 ++-- arch/x86/kernel/asm-offsets.c | 6 +++--- arch/x86/kernel/cpu/vmware.c | 6 +++--- arch/x86/kernel/kvm.c | 6 +++--- arch/x86/kernel/kvmclock.c | 6 +++--- arch/x86/kernel/paravirt.c | 33 +++++---------------------------- arch/x86/kernel/paravirt_patch_32.c | 16 ++++++++-------- arch/x86/kernel/paravirt_patch_64.c | 16 ++++++++-------- arch/x86/kernel/tsc.c | 2 +- arch/x86/kernel/vsmp_64.c | 18 +++++++++--------- arch/x86/lguest/boot.c | 20 ++++++++++---------- arch/x86/xen/enlighten_hvm.c | 4 ++-- arch/x86/xen/enlighten_pv.c | 28 ++++++++++++---------------- arch/x86/xen/irq.c | 12 ++++-------- arch/x86/xen/mmu_hvm.c | 2 +- arch/x86/xen/mmu_pv.c | 4 ++-- arch/x86/xen/time.c | 11 ++++------- drivers/xen/time.c | 2 +- 20 files changed, 101 insertions(+), 154 deletions(-) diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 55e0c1807df2..0f8194ec64c9 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -28,7 +28,7 @@ static inline enum paravirt_lazy_mode paravirt_get_lazy_mode(void) static inline unsigned long long paravirt_sched_clock(void) { - return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock); + return PVOP_CALL0(unsigned long long, pv_ops.sched_clock); } struct static_key; @@ -37,23 +37,23 @@ extern struct static_key paravirt_steal_rq_enabled; static inline u64 paravirt_steal_clock(int cpu) { - return PVOP_CALL1(u64, pv_time_ops.steal_clock, cpu); + return PVOP_CALL1(u64, pv_ops.steal_clock, cpu); } /* The paravirtualized I/O functions */ static inline void slow_down_io(void) { - pv_cpu_ops.io_delay(); + pv_ops.io_delay(); #ifdef REALLY_SLOW_IO - pv_cpu_ops.io_delay(); - pv_cpu_ops.io_delay(); - pv_cpu_ops.io_delay(); + pv_ops.io_delay(); + pv_ops.io_delay(); + pv_ops.io_delay(); #endif } static inline void paravirt_arch_exit_mmap(struct mm_struct *mm) { - PVOP_VCALL1(pv_mmu_ops.exit_mmap, mm); + PVOP_VCALL1(pv_ops.exit_mmap, mm); } static inline void flush_tlb_others(const struct cpumask *cpumask, @@ -61,7 +61,7 @@ static inline void flush_tlb_others(const struct cpumask *cpumask, unsigned long start, unsigned long end) { - PVOP_VCALL4(pv_mmu_ops.flush_tlb_others, cpumask, mm, start, end); + PVOP_VCALL4(pv_ops.flush_tlb_others, cpumask, mm, start, end); } #if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) @@ -173,22 +173,22 @@ static __always_inline bool pv_vcpu_is_preempted(long cpu) static inline notrace unsigned long arch_local_save_flags(void) { - return PVOP_CALLEE0(unsigned long, pv_irq_ops.save_fl); + return PVOP_CALLEE0(unsigned long, pv_ops.save_fl); } static inline notrace void arch_local_irq_restore(unsigned long f) { - PVOP_VCALLEE1(pv_irq_ops.restore_fl, f); + PVOP_VCALLEE1(pv_ops.restore_fl, f); } static inline notrace void arch_local_irq_disable(void) { - PVOP_VCALLEE0(pv_irq_ops.irq_disable); + PVOP_VCALLEE0(pv_ops.irq_disable); } static inline notrace void arch_local_irq_enable(void) { - PVOP_VCALLEE0(pv_irq_ops.irq_enable); + PVOP_VCALLEE0(pv_ops.irq_enable); } static inline notrace unsigned long arch_local_irq_save(void) @@ -286,15 +286,15 @@ extern void default_banner(void); #endif #define 
DISABLE_INTERRUPTS(clobbers) \ - PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers, \ + PARA_SITE(PARA_PATCH(pv_ops, PV_IRQ_irq_disable), clobbers, \ PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \ - call PARA_INDIRECT(pv_irq_ops+PV_IRQ_irq_disable); \ + call PARA_INDIRECT(pv_ops+PV_IRQ_irq_disable); \ PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);) #define ENABLE_INTERRUPTS(clobbers) \ - PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_enable), clobbers, \ + PARA_SITE(PARA_PATCH(pv_ops, PV_IRQ_irq_enable), clobbers, \ PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \ - call PARA_INDIRECT(pv_irq_ops+PV_IRQ_irq_enable); \ + call PARA_INDIRECT(pv_ops+PV_IRQ_irq_enable); \ PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);) #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index 34753d10ebbc..833529661acb 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -65,11 +65,9 @@ struct paravirt_callee_save { #endif /* general info */ -struct pv_info { +struct pv_ops { const char *name; -}; -struct pv_init_ops { /* * Patch may replace one of the defined code sequences with * arbitrary code, subject to the same register constraints. @@ -80,18 +78,12 @@ struct pv_init_ops { */ unsigned (*patch)(u8 type, u16 clobber, void *insnbuf, unsigned long addr, unsigned len); -}; -struct pv_time_ops { unsigned long long (*sched_clock)(void); unsigned long long (*steal_clock)(int cpu); -}; -struct pv_cpu_ops { void (*io_delay)(void); -}; -struct pv_irq_ops { /* * Get/set interrupt state. save_fl and restore_fl are only * expected to use X86_EFLAGS_IF; all other bits @@ -105,9 +97,7 @@ struct pv_irq_ops { struct paravirt_callee_save restore_fl; struct paravirt_callee_save irq_disable; struct paravirt_callee_save irq_enable; -}; -struct pv_mmu_ops { void (*exit_mmap)(struct mm_struct *mm); void (*flush_tlb_others)(const struct cpumask *cpus, struct mm_struct *mm, @@ -136,11 +126,7 @@ struct pv_lock_ops { * number for each function using the offset which we use to indicate * what to patch. */ struct paravirt_patch_template { - struct pv_init_ops pv_init_ops; - struct pv_time_ops pv_time_ops; - struct pv_cpu_ops pv_cpu_ops; - struct pv_irq_ops pv_irq_ops; - struct pv_mmu_ops pv_mmu_ops; + struct pv_ops pv_ops; struct pv_lock_ops pv_lock_ops; #ifdef CONFIG_PARAVIRT_FULL struct pvfull_cpu_ops pvfull_cpu_ops; @@ -149,12 +135,7 @@ struct paravirt_patch_template { #endif }; -extern struct pv_info pv_info; -extern struct pv_init_ops pv_init_ops; -extern struct pv_time_ops pv_time_ops; -extern struct pv_cpu_ops pv_cpu_ops; -extern struct pv_irq_ops pv_irq_ops; -extern struct pv_mmu_ops pv_mmu_ops; +extern struct pv_ops pv_ops; extern struct pv_lock_ops pv_lock_ops; #define PARAVIRT_PATCH(x) \ @@ -247,7 +228,7 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf, * The call instruction itself is marked by placing its start address * and size into the .parainstructions section, so that * apply_paravirt() in arch/i386/kernel/alternative.c can do the - * appropriate patching under the control of the backend pv_init_ops + * appropriate patching under the control of the backend pv_ops * implementation. 
* * Unfortunately there's no way to get gcc to generate the args setup diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index c5b8f760473c..ac1a9356616b 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -600,8 +600,8 @@ void __init_or_module apply_paravirt(struct paravirt_patch_site *start, BUG_ON(p->len > MAX_PATCH_LEN); /* prep the buffer with the original instructions */ memcpy(insnbuf, p->instr, p->len); - used = pv_init_ops.patch(p->instrtype, p->clobbers, insnbuf, - (unsigned long)p->instr, p->len); + used = pv_ops.patch(p->instrtype, p->clobbers, insnbuf, + (unsigned long)p->instr, p->len); BUG_ON(used > p->len); diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c index 18a5c06c007a..6a225a90bc31 100644 --- a/arch/x86/kernel/asm-offsets.c +++ b/arch/x86/kernel/asm-offsets.c @@ -64,9 +64,9 @@ void common(void) { #ifdef CONFIG_PARAVIRT BLANK(); - OFFSET(PARAVIRT_PATCH_pv_irq_ops, paravirt_patch_template, pv_irq_ops); - OFFSET(PV_IRQ_irq_disable, pv_irq_ops, irq_disable); - OFFSET(PV_IRQ_irq_enable, pv_irq_ops, irq_enable); + OFFSET(PARAVIRT_PATCH_pv_ops, paravirt_patch_template, pv_ops); + OFFSET(PV_IRQ_irq_disable, pv_ops, irq_disable); + OFFSET(PV_IRQ_irq_enable, pv_ops, irq_enable); #endif #ifdef CONFIG_PARAVIRT_FULL OFFSET(PARAVIRT_PATCH_pvfull_cpu_ops, paravirt_patch_template, diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c index 40ed26852ebd..6be8af37e227 100644 --- a/arch/x86/kernel/cpu/vmware.c +++ b/arch/x86/kernel/cpu/vmware.c @@ -97,14 +97,14 @@ static void __init vmware_sched_clock_setup(void) d->cyc2ns_offset = mul_u64_u32_shr(tsc_now, d->cyc2ns_mul, d->cyc2ns_shift); - pv_time_ops.sched_clock = vmware_sched_clock; + pv_ops.sched_clock = vmware_sched_clock; pr_info("using sched offset of %llu ns\n", d->cyc2ns_offset); } static void __init vmware_paravirt_ops_setup(void) { - pv_info.name = "VMware hypervisor"; - pv_cpu_ops.io_delay = paravirt_nop; + pv_ops.name = "VMware hypervisor"; + pv_ops.io_delay = paravirt_nop; if (vmware_tsc_khz && vmw_sched_clock) vmware_sched_clock_setup(); diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index da5c09789984..2aefbcea9ae4 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -280,10 +280,10 @@ NOKPROBE_SYMBOL(do_async_page_fault); static void __init paravirt_ops_setup(void) { - pv_info.name = "KVM"; + pv_ops.name = "KVM"; if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY)) - pv_cpu_ops.io_delay = kvm_io_delay; + pv_ops.io_delay = kvm_io_delay; #ifdef CONFIG_X86_IO_APIC no_timer_check = 1; @@ -467,7 +467,7 @@ void __init kvm_guest_init(void) if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) { has_steal_clock = 1; - pv_time_ops.steal_clock = kvm_steal_clock; + pv_ops.steal_clock = kvm_steal_clock; } if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index d88967659098..d3c92f7ce79b 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -109,13 +109,13 @@ static u64 kvm_sched_clock_read(void) static inline void kvm_sched_clock_init(bool stable) { if (!stable) { - pv_time_ops.sched_clock = kvm_clock_read; + pv_ops.sched_clock = kvm_clock_read; clear_sched_clock_stable(); return; } kvm_sched_clock_offset = kvm_clock_read(); - pv_time_ops.sched_clock = kvm_sched_clock_read; + pv_ops.sched_clock = kvm_sched_clock_read; printk(KERN_INFO "kvm-clock: using sched offset of %llu cycles\n", kvm_sched_clock_offset); @@ -308,7 +308,7 @@ 
void __init kvmclock_init(void) #endif kvm_get_preset_lpj(); clocksource_register_hz(&kvm_clock, NSEC_PER_SEC); - pv_info.name = "KVM"; + pv_ops.name = "KVM"; } int __init kvm_setup_vsyscall_timeinfo(void) diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 42da2fde1fef..cde433e495f6 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -63,7 +63,7 @@ u64 notrace _paravirt_ident_64(u64 x) void __init default_banner(void) { printk(KERN_INFO "Booting paravirtualized kernel on %s\n", - pv_info.name); + pv_ops.name); } /* Undefined instruction for dealing with missing ops pointers. */ @@ -114,11 +114,7 @@ unsigned paravirt_patch_jmp(void *insnbuf, const void *target, static void *get_call_destination(u8 type) { struct paravirt_patch_template tmpl = { - .pv_init_ops = pv_init_ops, - .pv_time_ops = pv_time_ops, - .pv_cpu_ops = pv_cpu_ops, - .pv_irq_ops = pv_irq_ops, - .pv_mmu_ops = pv_mmu_ops, + .pv_ops = pv_ops, #ifdef CONFIG_PARAVIRT_SPINLOCKS .pv_lock_ops = pv_lock_ops, #endif @@ -185,37 +181,18 @@ static u64 native_steal_clock(int cpu) return 0; } -struct pv_info pv_info = { +__visible struct pv_ops pv_ops = { .name = "bare hardware", -}; - -struct pv_init_ops pv_init_ops = { .patch = native_patch, -}; - -struct pv_time_ops pv_time_ops = { .sched_clock = native_sched_clock, .steal_clock = native_steal_clock, -}; - -__visible struct pv_irq_ops pv_irq_ops = { + .io_delay = native_io_delay, .save_fl = __PV_IS_CALLEE_SAVE(native_save_fl), .restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl), .irq_disable = __PV_IS_CALLEE_SAVE(native_irq_disable), .irq_enable = __PV_IS_CALLEE_SAVE(native_irq_enable), -}; - -__visible struct pv_cpu_ops pv_cpu_ops = { - .io_delay = native_io_delay, -}; - -struct pv_mmu_ops pv_mmu_ops __ro_after_init = { .flush_tlb_others = native_flush_tlb_others, .exit_mmap = paravirt_nop, }; -EXPORT_SYMBOL_GPL(pv_time_ops); -EXPORT_SYMBOL (pv_cpu_ops); -EXPORT_SYMBOL (pv_mmu_ops); -EXPORT_SYMBOL (pv_info); -EXPORT_SYMBOL (pv_irq_ops); +EXPORT_SYMBOL(pv_ops); diff --git a/arch/x86/kernel/paravirt_patch_32.c b/arch/x86/kernel/paravirt_patch_32.c index b5f93cb0d05f..48e44290cff0 100644 --- a/arch/x86/kernel/paravirt_patch_32.c +++ b/arch/x86/kernel/paravirt_patch_32.c @@ -1,9 +1,9 @@ #include <asm/paravirt.h> -DEF_NATIVE(pv_irq_ops, irq_disable, "cli"); -DEF_NATIVE(pv_irq_ops, irq_enable, "sti"); -DEF_NATIVE(pv_irq_ops, restore_fl, "push %eax; popf"); -DEF_NATIVE(pv_irq_ops, save_fl, "pushf; pop %eax"); +DEF_NATIVE(pv_ops, irq_disable, "cli"); +DEF_NATIVE(pv_ops, irq_enable, "sti"); +DEF_NATIVE(pv_ops, restore_fl, "push %eax; popf"); +DEF_NATIVE(pv_ops, save_fl, "pushf; pop %eax"); #ifdef CONFIG_PARAVIRT_FULL DEF_NATIVE(pvfull_mmu_ops, read_cr2, "mov %cr2, %eax"); DEF_NATIVE(pvfull_mmu_ops, write_cr3, "mov %eax, %cr3"); @@ -43,10 +43,10 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf, end = end_##ops##_##x; \ goto patch_site switch (type) { - PATCH_SITE(pv_irq_ops, irq_disable); - PATCH_SITE(pv_irq_ops, irq_enable); - PATCH_SITE(pv_irq_ops, restore_fl); - PATCH_SITE(pv_irq_ops, save_fl); + PATCH_SITE(pv_ops, irq_disable); + PATCH_SITE(pv_ops, irq_enable); + PATCH_SITE(pv_ops, restore_fl); + PATCH_SITE(pv_ops, save_fl); #ifdef CONFIG_PARAVIRT_FULL PATCH_SITE(pvfull_mmu_ops, read_cr2); PATCH_SITE(pvfull_mmu_ops, read_cr3); diff --git a/arch/x86/kernel/paravirt_patch_64.c b/arch/x86/kernel/paravirt_patch_64.c index 473688054f0b..158943a18ca2 100644 --- a/arch/x86/kernel/paravirt_patch_64.c +++ b/arch/x86/kernel/paravirt_patch_64.c 
@@ -2,10 +2,10 @@ #include <asm/asm-offsets.h> #include <linux/stringify.h> -DEF_NATIVE(pv_irq_ops, irq_disable, "cli"); -DEF_NATIVE(pv_irq_ops, irq_enable, "sti"); -DEF_NATIVE(pv_irq_ops, restore_fl, "pushq %rdi; popfq"); -DEF_NATIVE(pv_irq_ops, save_fl, "pushfq; popq %rax"); +DEF_NATIVE(pv_ops, irq_disable, "cli"); +DEF_NATIVE(pv_ops, irq_enable, "sti"); +DEF_NATIVE(pv_ops, restore_fl, "pushq %rdi; popfq"); +DEF_NATIVE(pv_ops, save_fl, "pushfq; popq %rax"); DEF_NATIVE(, mov32, "mov %edi, %eax"); DEF_NATIVE(, mov64, "mov %rdi, %rax"); @@ -52,10 +52,10 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf, end = end_##ops##_##x; \ goto patch_site switch(type) { - PATCH_SITE(pv_irq_ops, restore_fl); - PATCH_SITE(pv_irq_ops, save_fl); - PATCH_SITE(pv_irq_ops, irq_enable); - PATCH_SITE(pv_irq_ops, irq_disable); + PATCH_SITE(pv_ops, restore_fl); + PATCH_SITE(pv_ops, save_fl); + PATCH_SITE(pv_ops, irq_enable); + PATCH_SITE(pv_ops, irq_disable); #ifdef CONFIG_PARAVIRT_FULL PATCH_SITE(pvfull_mmu_ops, read_cr2); PATCH_SITE(pvfull_mmu_ops, read_cr3); diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index 714dfba6a1e7..678fc8923cb8 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -330,7 +330,7 @@ unsigned long long sched_clock(void) bool using_native_sched_clock(void) { - return pv_time_ops.sched_clock == native_sched_clock; + return pv_ops.sched_clock == native_sched_clock; } #else unsigned long long diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c index b034b1b14b9c..8e003dac79fb 100644 --- a/arch/x86/kernel/vsmp_64.c +++ b/arch/x86/kernel/vsmp_64.c @@ -76,10 +76,10 @@ static unsigned __init vsmp_patch(u8 type, u16 clobbers, void *ibuf, unsigned long addr, unsigned len) { switch (type) { - case PARAVIRT_PATCH(pv_irq_ops.irq_enable): - case PARAVIRT_PATCH(pv_irq_ops.irq_disable): - case PARAVIRT_PATCH(pv_irq_ops.save_fl): - case PARAVIRT_PATCH(pv_irq_ops.restore_fl): + case PARAVIRT_PATCH(pv_ops.irq_enable): + case PARAVIRT_PATCH(pv_ops.irq_disable): + case PARAVIRT_PATCH(pv_ops.save_fl): + case PARAVIRT_PATCH(pv_ops.restore_fl): return paravirt_patch_default(type, clobbers, ibuf, addr, len); default: return native_patch(type, clobbers, ibuf, addr, len); @@ -117,11 +117,11 @@ static void __init set_vsmp_pv_ops(void) if (cap & ctl & (1 << 4)) { /* Setup irq ops and turn on vSMP IRQ fastpath handling */ - pv_irq_ops.irq_disable = PV_CALLEE_SAVE(vsmp_irq_disable); - pv_irq_ops.irq_enable = PV_CALLEE_SAVE(vsmp_irq_enable); - pv_irq_ops.save_fl = PV_CALLEE_SAVE(vsmp_save_fl); - pv_irq_ops.restore_fl = PV_CALLEE_SAVE(vsmp_restore_fl); - pv_init_ops.patch = vsmp_patch; + pv_ops.irq_disable = PV_CALLEE_SAVE(vsmp_irq_disable); + pv_ops.irq_enable = PV_CALLEE_SAVE(vsmp_irq_enable); + pv_ops.save_fl = PV_CALLEE_SAVE(vsmp_save_fl); + pv_ops.restore_fl = PV_CALLEE_SAVE(vsmp_restore_fl); + pv_ops.patch = vsmp_patch; ctl &= ~(1 << 4); } writel(ctl, address + 4); diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c index 86b8b1a0c99e..ccb6647a5167 100644 --- a/arch/x86/lguest/boot.c +++ b/arch/x86/lguest/boot.c @@ -1351,8 +1351,8 @@ static const struct lguest_insns { const char *start, *end; } lguest_insns[] = { - [PARAVIRT_PATCH(pv_irq_ops.irq_disable)] = { lgstart_cli, lgend_cli }, - [PARAVIRT_PATCH(pv_irq_ops.save_fl)] = { lgstart_pushf, lgend_pushf }, + [PARAVIRT_PATCH(pv_ops.irq_disable)] = { lgstart_cli, lgend_cli }, + [PARAVIRT_PATCH(pv_ops.save_fl)] = { lgstart_pushf, lgend_pushf }, }; /* @@ -1388,7 +1388,10 @@ static unsigned lguest_patch(u8 type, 
u16 clobber, void *ibuf, __init void lguest_init(void) { /* We're under lguest. */ - pv_info.name = "lguest"; + pv_ops.name = "lguest"; + /* Setup operations */ + pv_ops.patch = lguest_patch; + /* We're running at privilege level 1, not 0 as normal. */ pvfull_info.kernel_rpl = 1; /* Everyone except Xen runs with this set. */ @@ -1400,15 +1403,12 @@ __init void lguest_init(void) */ /* Interrupt-related operations */ - pv_irq_ops.save_fl = PV_CALLEE_SAVE(lguest_save_fl); - pv_irq_ops.restore_fl = __PV_IS_CALLEE_SAVE(lg_restore_fl); - pv_irq_ops.irq_disable = PV_CALLEE_SAVE(lguest_irq_disable); - pv_irq_ops.irq_enable = __PV_IS_CALLEE_SAVE(lg_irq_enable); + pv_ops.save_fl = PV_CALLEE_SAVE(lguest_save_fl); + pv_ops.restore_fl = __PV_IS_CALLEE_SAVE(lg_restore_fl); + pv_ops.irq_disable = PV_CALLEE_SAVE(lguest_irq_disable); + pv_ops.irq_enable = __PV_IS_CALLEE_SAVE(lg_irq_enable); pvfull_irq_ops.safe_halt = lguest_safe_halt; - /* Setup operations */ - pv_init_ops.patch = lguest_patch; - /* Intercepts of various CPU instructions */ pvfull_cpu_ops.load_gdt = lguest_load_gdt; pvfull_cpu_ops.cpuid = lguest_cpuid; diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c index a6d014f47e52..03165e3101f2 100644 --- a/arch/x86/xen/enlighten_hvm.c +++ b/arch/x86/xen/enlighten_hvm.c @@ -69,12 +69,12 @@ static void __init init_hvm_pv_info(void) /* PVH set up hypercall page in xen_prepare_pvh(). */ if (xen_pvh_domain()) - pv_info.name = "Xen PVH"; + pv_ops.name = "Xen PVH"; else { u64 pfn; uint32_t msr; - pv_info.name = "Xen HVM"; + pv_ops.name = "Xen HVM"; msr = cpuid_ebx(base + 2); pfn = __pa(hypercall_page); wrmsr_safe(msr, (u32)pfn, (u32)(pfn >> 32)); diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index dcf1b4183c49..9fa6698f2f26 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -144,7 +144,7 @@ static void __init xen_banner(void) pr_info("Booting paravirtualized kernel %son %s\n", xen_feature(XENFEAT_auto_translated_physmap) ? - "with PVH extensions " : "", pv_info.name); + "with PVH extensions " : "", pv_ops.name); printk(KERN_INFO "Xen version: %d.%d%s%s\n", version >> 16, version & 0xffff, extra.extraversion, xen_feature(XENFEAT_mmu_pt_update_preserve_ad) ? " (preserve-AD)" : ""); @@ -998,10 +998,10 @@ void xen_setup_vcpu_info_placement(void) * percpu area for all cpus, so make use of it. 
*/ if (xen_have_vcpu_info_placement) { - pv_irq_ops.save_fl = __PV_IS_CALLEE_SAVE(xen_save_fl_direct); - pv_irq_ops.restore_fl = __PV_IS_CALLEE_SAVE(xen_restore_fl_direct); - pv_irq_ops.irq_disable = __PV_IS_CALLEE_SAVE(xen_irq_disable_direct); - pv_irq_ops.irq_enable = __PV_IS_CALLEE_SAVE(xen_irq_enable_direct); + pv_ops.save_fl = __PV_IS_CALLEE_SAVE(xen_save_fl_direct); + pv_ops.restore_fl = __PV_IS_CALLEE_SAVE(xen_restore_fl_direct); + pv_ops.irq_disable = __PV_IS_CALLEE_SAVE(xen_irq_disable_direct); + pv_ops.irq_enable = __PV_IS_CALLEE_SAVE(xen_irq_enable_direct); pvfull_mmu_ops.read_cr2 = xen_read_cr2_direct; } } @@ -1024,10 +1024,10 @@ static unsigned xen_patch(u8 type, u16 clobbers, void *insnbuf, goto patch_site switch (type) { - SITE(pv_irq_ops, irq_enable); - SITE(pv_irq_ops, irq_disable); - SITE(pv_irq_ops, save_fl); - SITE(pv_irq_ops, restore_fl); + SITE(pv_ops, irq_enable); + SITE(pv_ops, irq_disable); + SITE(pv_ops, save_fl); + SITE(pv_ops, restore_fl); #undef SITE patch_site: @@ -1067,10 +1067,6 @@ static const struct pvfull_info xen_info __initconst = { #endif }; -static const struct pv_init_ops xen_init_ops __initconst = { - .patch = xen_patch, -}; - static const struct pvfull_cpu_ops xen_cpu_ops __initconst = { .cpuid = xen_cpuid, @@ -1267,10 +1263,10 @@ asmlinkage __visible void __init xen_start_kernel(void) /* Install Xen paravirt ops */ pvfull_info = xen_info; - pv_info.name = "Xen"; - pv_init_ops = xen_init_ops; + pv_ops.name = "Xen"; + pv_ops.patch = xen_patch; + pv_ops.io_delay = xen_io_delay; pvfull_cpu_ops = xen_cpu_ops; - pv_cpu_ops.io_delay = xen_io_delay; x86_platform.get_nmi_reason = xen_get_nmi_reason; diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c index c9dba9d8cecf..eeced2f4ccb6 100644 --- a/arch/x86/xen/irq.c +++ b/arch/x86/xen/irq.c @@ -115,13 +115,6 @@ static void xen_halt(void) xen_safe_halt(); } -static const struct pv_irq_ops xen_irq_ops __initconst = { - .save_fl = PV_CALLEE_SAVE(xen_save_fl), - .restore_fl = PV_CALLEE_SAVE(xen_restore_fl), - .irq_disable = PV_CALLEE_SAVE(xen_irq_disable), - .irq_enable = PV_CALLEE_SAVE(xen_irq_enable), -}; - static const struct pvfull_irq_ops xen_full_irq_ops __initconst = { .safe_halt = xen_safe_halt, .halt = xen_halt, @@ -132,7 +125,10 @@ static const struct pvfull_irq_ops xen_full_irq_ops __initconst = { void __init xen_init_irq_ops(void) { - pv_irq_ops = xen_irq_ops; + pv_ops.save_fl = PV_CALLEE_SAVE(xen_save_fl); + pv_ops.restore_fl = PV_CALLEE_SAVE(xen_restore_fl); + pv_ops.irq_disable = PV_CALLEE_SAVE(xen_irq_disable); + pv_ops.irq_enable = PV_CALLEE_SAVE(xen_irq_enable); pvfull_irq_ops = xen_full_irq_ops; x86_init.irqs.intr_init = xen_init_IRQ; } diff --git a/arch/x86/xen/mmu_hvm.c b/arch/x86/xen/mmu_hvm.c index 1c57f1cd545c..bf6472b444c5 100644 --- a/arch/x86/xen/mmu_hvm.c +++ b/arch/x86/xen/mmu_hvm.c @@ -72,7 +72,7 @@ static int is_pagetable_dying_supported(void) void __init xen_hvm_init_mmu_ops(void) { if (is_pagetable_dying_supported()) - pv_mmu_ops.exit_mmap = xen_hvm_exit_mmap; + pv_ops.exit_mmap = xen_hvm_exit_mmap; #ifdef CONFIG_PROC_VMCORE register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram); #endif diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index 7be3e21a4dac..89da3c2b8248 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -2513,8 +2513,8 @@ void __init xen_init_mmu_ops(void) return; pvfull_mmu_ops = xen_mmu_ops; - pv_mmu_ops.flush_tlb_others = xen_flush_tlb_others; - pv_mmu_ops.exit_mmap = xen_exit_mmap; + pv_ops.flush_tlb_others = xen_flush_tlb_others; + 
pv_ops.exit_mmap = xen_exit_mmap; memset(dummy_mapping, 0xff, PAGE_SIZE); } diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index a1895a8e85c1..c5f7e5e6eea6 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -366,11 +366,6 @@ void xen_timer_resume(void) } } -static const struct pv_time_ops xen_time_ops __initconst = { - .sched_clock = xen_clocksource_read, - .steal_clock = xen_steal_clock, -}; - static void __init xen_time_init(void) { int cpu = smp_processor_id(); @@ -408,7 +403,8 @@ static void __init xen_time_init(void) void __ref xen_init_time_ops(void) { - pv_time_ops = xen_time_ops; + pv_ops.sched_clock = xen_clocksource_read; + pv_ops.steal_clock = xen_steal_clock; x86_init.timers.timer_init = xen_time_init; x86_init.timers.setup_percpu_clockev = x86_init_noop; @@ -450,7 +446,8 @@ void __init xen_hvm_init_time_ops(void) return; } - pv_time_ops = xen_time_ops; + pv_ops.sched_clock = xen_clocksource_read; + pv_ops.steal_clock = xen_steal_clock; x86_init.timers.setup_percpu_clockev = xen_time_init; x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents; diff --git a/drivers/xen/time.c b/drivers/xen/time.c index ac5f23fcafc2..37c355734bff 100644 --- a/drivers/xen/time.c +++ b/drivers/xen/time.c @@ -106,7 +106,7 @@ void __init xen_time_setup_guest(void) xen_runstate_remote = !HYPERVISOR_vm_assist(VMASST_CMD_enable, VMASST_TYPE_runstate_update_flag); - pv_time_ops.steal_clock = xen_steal_clock; + pv_ops.steal_clock = xen_steal_clock; static_key_slow_inc(&paravirt_steal_enabled); if (xen_runstate_remote) -- 2.12.0
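As a usage illustration, a minimal sketch of what a guest port looks like against the merged structure; the examplehv_* names are hypothetical, and only pv_ops members introduced by this patch are touched.

#include <asm/paravirt.h>

static u64 examplehv_sched_clock(void)
{
	return 0;	/* would read a paravirtual clock from the hypervisor */
}

static u64 examplehv_steal_clock(int cpu)
{
	return 0;	/* would return per-vcpu stolen time in ns */
}

void __init examplehv_init(void)
{
	pv_ops.name        = "examplehv";
	pv_ops.io_delay    = paravirt_nop;	/* no port-0x80 delay needed */
	pv_ops.sched_clock = examplehv_sched_clock;
	pv_ops.steal_clock = examplehv_steal_clock;
	/* save_fl/restore_fl/irq_* and .patch keep their native defaults */
}

This shows the design point of the merge: everything a KVM/Hyper-V/VMware-style guest needs fits in the one small pv_ops, while Xen PV and lguest additionally fill in the pvfull_*_ops structures.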
Boris Ostrovsky
2017-May-22 19:42 UTC
[PATCH 00/10] paravirt: make amount of paravirtualization configurable
> 49 files changed, 1548 insertions(+), 1477 deletions(-)
> create mode 100644 arch/x86/include/asm/paravirt_full.h
> create mode 100644 arch/x86/include/asm/paravirt_types_full.h
> create mode 100644 arch/x86/kernel/paravirt_full.c

Do you have this in a tree that can be pulled?

-boris
Juergen Gross
2017-May-23 06:27 UTC
[PATCH 00/10] paravirt: make amount of paravirtualization configurable
On 22/05/17 21:42, Boris Ostrovsky wrote:
>> 49 files changed, 1548 insertions(+), 1477 deletions(-)
>> create mode 100644 arch/x86/include/asm/paravirt_full.h
>> create mode 100644 arch/x86/include/asm/paravirt_types_full.h
>> create mode 100644 arch/x86/kernel/paravirt_full.c
>
> Do you have this in a tree that can be pulled?

https://github.com/jgross1/linux pvops

Juergen
Boris Ostrovsky
2017-May-24 15:40 UTC
[PATCH 05/10] paravirt: add new PARAVIRT_FULL config item
On 05/19/2017 11:47 AM, Juergen Gross wrote:
> Add a new config item PARAVIRT_FULL. It will be used to guard the
> pv_*_ops functions used by fully paravirtualized guests (Xen pv-guests
> and lguest) only.
>
> Kernels not meant to support those guest types will be able to use many
> operations without paravirt abstraction while still supporting all the
> other paravirt features.
>
> For now just add the new Kconfig option and select it for XEN_PV and
> LGUEST_GUEST. Add paravirt_full.c, paravirt_full.h and
> paravirt_types_full.h which will contain the necessary implementation
> parts of the pv guest specific paravirt functions.

Is it not possible to just 'ifdef CONFIG_PARAVIRT_FULL' the (ir)relevant parts of paravirt.[ch] and paravirt_types.h? Separating structures and files into pv and pvfull seems somewhat arbitrary (.flush_tlb_others in patch 8 being a good example of one type of guest deciding to use something that normally would be considered part of a pvfull-type structure).

-boris
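For comparison, one way the suggested alternative could look (a sketch under that assumption, not code from the series): keep a single structure per area and guard the pv-guest-only members in place.

/* requires <linux/cpumask.h>, <linux/mm_types.h>, <asm/pgtable_types.h> */
struct pv_mmu_ops {
	/* used by all PARAVIRT kernels */
	void (*exit_mmap)(struct mm_struct *mm);
	void (*flush_tlb_others)(const struct cpumask *cpus,
				 struct mm_struct *mm,
				 unsigned long start, unsigned long end);
#ifdef CONFIG_PARAVIRT_FULL
	/* pv-guest-only hooks, compiled out otherwise */
	unsigned long (*read_cr2)(void);
	void (*write_cr3)(unsigned long cr3);
	void (*set_pte)(pte_t *ptep, pte_t pteval);
	/* ... remaining page-table and lazy-mode hooks ... */
#endif
};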